Establishment of a Center of Excellence for Applied Mathematical and
Statistical Research


Wayne A. Woodward and H. L. Gray


Center for Applied Mathematical and Statistical Research 
Southern Methodist University

Dallas, Texas 75275 


January 1983 


Final Report 


Establishment of a Center of Excellence
for Applied Mathematical and
Statistical Research

Introduction 

In this report, we will describe the research efforts which 
have been undertaken at Southern Methodist University (SMU) 
in support of contract NAS 9-16438. As the title of the
contract states, a first priority has been the establishment 
of a "Center of Excellence" for directing and carrying out 
research in the area of Aerospace Remote Sensing. Such a 
center is needed in order to adequately organize and direct 
mathematical and statistical research in support of the 
AgRISTARS objectives. We have conducted a thorough 
assessment of the current state of the art (as defined by 
NASA and its contractors) with regard to estimation efforts 
in support of the crop production estimation problem. In 
particular, we have reviewed old methods and have evaluated 
methods in current use.


This review and evaluation process was facilitated 
through seminars in which methods were presented and 
discussed. Among the methods reviewed in this manner were:

Proportion estimators from LACIE - analyst dependent 

(i) PC estimator 

(ii) Procedure 1 estimator 

(iii) etc. 

CLASSY/APEP

AMOEBA/HISSE

Procedure M 

Spatial/Color Sequence 

ERIM Profile Model 

Multitemporal Profile Modeling 

Others 

Reviews and evaluations have been presented as lengthy 
written reports, such as the report in Appendix A on the 
multitemporal profile modeling. Other reports have been in 
the form of written and oral reports delivered to the
project director and at workshop settings.

Our second major effort has been in the area of 
development of alternative generic proportion estimation 
techniques. Of course, there is no distinct dividing line 
between the efforts involved in the two tasks. For example, 
as we developed alternative proportion estimation
techniques, we compared these with the existing techniques.
This provides us with further insight into the performance
of the current procedures.


Major Reports 

Our efforts have resulted in three major written reports 
which will be introduced in this section. These reports are 
included in the Appendix. During the early months of the 
contract, our major efforts were in the evaluation of 
current and former methods. At this time, G. Badhwar had 
introduced a procedure for modeling the multitemporal 
profile for a crop. It was believed that this profile 
(usually of "greenness") across the growing season would 
provide feature variables with superior discriminating 
power. Early results using this procedure showed that it had 
promise. We were asked to evaluate this procedure and make 
recommendations. Our report is included in Appendix A, and 
was presented at the January 1982 Quarterly Technical 
Interchange. Basically, we took a systematic look at the 
modeling of the greenness profile, and discussed the 
properties which such a model should possess. Our major
concern with the early Badhwar model was that in that model,
emergence date, $t_0$, was not a location parameter. This
concern was mentioned in discussions with Dr. Badhwar in 
October 1981. Recent modifications of the profile model have
included emergence date as a location parameter, and we
believe that our evaluations had an impact on these
modifications. Various possible models for the profile are
discussed in the report in Appendix A, along with results
from both simulated and LANDSAT data.

In Appendix B we include a report which is a
compilation of results presented at both the April and
October 1982 Quarterly Technical Interchanges and at the
special mini-symposium at NASA in December 1982. These
results were also presented at a special session on remote
sensing at the national meetings of the American Statistical
Association in Cincinnati, August 1982, and were published in
the Proceedings of the Section on Survey Research Methods.
This report has been distributed as
Technical Report SR-62-04376, and it summarizes some of the
results obtained in our second major effort, specifically,
the development of alternative generic proportion
estimators.

The mixture model is currently being used extensively 
by NASA and its contractors to obtain crop proportion 
estimates. CLASSY was an early result of this effort, and 
current investigations in this area are included in the APEP 
study headed by R. Heydorn. Parameter estimation in this
mixture model is being accomplished using maximum likelihood
(ML) techniques based upon an assumption that the underlying
components are normally distributed. Although
ML estimators have desirable optimality properties when the
underlying assumptions are valid, they are notoriously 
sensitive to departures from these underlying assumptions. 
It is our belief that the underlying normality assumption in 
the case of LANDSAT data is of questionable validity. For 
these reasons we investigated alternatives to ML estimation 
which were not as sensitive to departures from the 
underlying assumptions. Our investigations in this area have 
centered around minimum distance (MD) estimation. We 
conducted a simulation study in which the ML and MD 
estimators were compared on both mixtures of normal and of 
non-normal components. We have shown that MD estimators are 
competitive with ML estimators when the components actually 
are normal, while they tend to be superior when the 
components are non-normal yet symmetric. The non-normal 
model used is the Student's t with 4 degrees of freedom, 
and similar results have recently been obtained for the 
double exponential. Neither of these models is extremely 
non-normal. Thus even when the non-normality would probably 
not be detectable visually, the MD estimates are better than 
the "optimal" ML estimates. The results of this study are
given in Appendix B. 
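
The idea behind minimum distance estimation can be illustrated with a small sketch. The estimator studied in Appendix B is not reproduced in this section, so everything specific below is an assumption made for illustration: the Cramer-von Mises distance, normal components with known means and standard deviations, a grid search over the mixing proportion only, and all function names.

```python
import math
import random


def normal_cdf(x, mu, sigma):
    """Normal distribution function evaluated with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def mixture_cdf(x, p, comp1, comp2):
    """CDF of a two-component mixture with proportion p on comp1."""
    return p * normal_cdf(x, *comp1) + (1.0 - p) * normal_cdf(x, *comp2)


def cvm_distance(sorted_x, p, comp1, comp2):
    """Cramer-von Mises distance between the empirical and mixture CDFs."""
    n = len(sorted_x)
    total = 1.0 / (12.0 * n)
    for i, x in enumerate(sorted_x, start=1):
        total += (mixture_cdf(x, p, comp1, comp2) - (2 * i - 1) / (2.0 * n)) ** 2
    return total


def md_proportion(data, comp1, comp2, grid=200):
    """Grid-search MD estimate of the mixing proportion (components assumed known)."""
    xs = sorted(data)
    candidates = [k / grid for k in range(grid + 1)]
    return min(candidates, key=lambda p: cvm_distance(xs, p, comp1, comp2))


# Hypothetical example: 30% of the sample comes from the first component.
random.seed(0)
comp1, comp2 = (0.0, 1.0), (3.0, 1.0)
true_p = 0.3
sample = [
    random.gauss(*comp1) if random.random() < true_p else random.gauss(*comp2)
    for _ in range(1000)
]
p_hat = md_proportion(sample, comp1, comp2)
```

Because the distance is computed between distribution functions rather than through a likelihood, a moderate change in the tail shape of the components alters the objective only slightly, which is one intuitive reading of the robustness reported above.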

Although the results shown in Appendix B basically 
reflect research efforts in the area of development of 
generic proportion estimation techniques, they also involve 
an "evaluation" component. In particular, we believe that the
results of our simulations provide much needed insight
concerning the role of the normality assumption in the
current implementations. It was found that
normal based estimation techniques often provide very biased
estimates when the underlying distributions are actually
skewed. For example, a 50-50 mixture of two chi-squared
distributions will "confuse" the normal based procedures
which assume that the underlying distributions are
symmetric. This phenomenon is mentioned in Section 5 of the
report in Appendix B. The problem of asymmetry is one of
extreme concern since the variables currently being used in
proportion estimation are feature variables from the profile
models, and these variables have been shown to have
asymmetric distributions. In Appendix C we have suggested an
approach to the problem of obtaining proportion estimates 
when the underlying distributions are asymmetric. This 
report reflects material which was presented at the October 
1982 Quarterly Technical Interchange and at the December 
1982 mini-symposium. Briefly, instead of assuming that 
components are normally distributed, we have proposed that 
they be assumed to have Weibull distributions. This 
assumption is made since Weibull distributions are 
"flexible" in the sense that they can be either symmetric or 
asymmetric depending upon parameter configurations. 
Properties of the Weibull are summarized in Appendix C along 
with the proposed procedures for estimating the parameters
in a mixture of Weibulls. This procedure utilizes the MD
techniques discussed in Appendix B. Although ML estimation
is shown to be quite intractable in this setting, the MD
estimators are relatively easy to obtain. The results in 
this report suggest that this Weibull assumption may prove 
to be a viable alternative to the procedures now in use. 
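
The "flexibility" of the Weibull family can be made concrete by computing its skewness as a function of the shape parameter. The sketch below is an illustration added here, not material from Appendix C; it uses the standard moment formula $E[X^k] = \Gamma(1 + k/c)$ for a Weibull with unit scale and shape $c$, and the function name is hypothetical.

```python
import math


def weibull_skewness(shape):
    """Skewness of a Weibull distribution with unit scale and the given shape."""
    g1 = math.gamma(1.0 + 1.0 / shape)   # first raw moment
    g2 = math.gamma(1.0 + 2.0 / shape)   # second raw moment
    g3 = math.gamma(1.0 + 3.0 / shape)   # third raw moment
    variance = g2 - g1 ** 2
    third_central = g3 - 3.0 * g1 * g2 + 2.0 * g1 ** 3
    return third_central / variance ** 1.5


# Shape 1 is the exponential distribution; larger shapes pass through a
# nearly symmetric case and then become skewed to the left.
for c in (1.0, 2.0, 3.6, 10.0):
    print(c, round(weibull_skewness(c), 3))
```

The skewness is positive for small shape values, close to zero near a shape of about 3.6, and negative for large shape values, so a single family covers right-skewed, nearly symmetric, and left-skewed components.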


Future Research Directions 


In each of the reports in Appendices A-C, suggestions are
made for future research. We refer the reader to those
sections for a discussion of research topics which are
suggested by the current results.


Other Reports 

In Appendices D and E we include two other reports which 
were technical evaluations requested by the project 
directors.

A Temporal Model For Crop Classification
by
H. L. Gray and W. A. Woodward


Introduction 

In a recent article G. D. Badhwar (1980) suggested a
function $p_b(t)$ for modeling the greenness spectral profile of
a crop from emergence to harvest. The function $p_b(t)$ is
defined as follows:

$$p_b(t) = \begin{cases} p_0, & t \le t_0 \\ p_0 \left(\dfrac{t}{t_0}\right)^{\alpha} \exp[-\beta(t^2 - t_0^2)], & t_0 \le t \end{cases} \qquad \text{(I)}$$

where

$p_0$ = soil greenness
$t_0$ = emergence date

and $\alpha$ and $\beta$ are parameters to be estimated.
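
As an illustrative sketch (not code from the report), Model I can be written as a short Python function. Setting the derivative of the second branch to zero gives $\alpha/t - 2\beta t = 0$, so the curve peaks at $t = \sqrt{\alpha/(2\beta)}$ whenever that date falls after $t_0$. The function names are hypothetical. The values of $\alpha$, $\beta$ and $t_0$ are those listed for corn field C02 under Model I in Table II, using the table's convention $\beta$ = .0001 $\beta_c$, while the soil greenness of 7.0 is an assumed value.

```python
import math


def model_I(t, p0, t0, alpha, beta):
    """Greenness under Model I; t and t0 are Julian dates."""
    if t <= t0:
        return p0
    return p0 * (t / t0) ** alpha * math.exp(-beta * (t * t - t0 * t0))


def model_I_peak_date(alpha, beta):
    """Date at which the second branch of Model I is largest."""
    return math.sqrt(alpha / (2.0 * beta))


# alpha, beta and t0 as listed for field C02 (Model I); p0 = 7.0 is assumed.
p0, t0, alpha, beta = 7.0, 147.0, 18.9, 2.14e-4
t_peak = model_I_peak_date(alpha, beta)
days_to_peak = t_peak - t0
```

With these inputs the peak falls about 63 days after emergence, which agrees with the $t_m - t_0$ entry for C02 in Table II.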

By applying Model I to Landsat spring wheat data for
LACIE segments in North Dakota and Minnesota, Badhwar demonstrated
that Model I could be used to successfully estimate $t_0$ in these
cases.

Badhwar (1979) and Badhwar, Carnes, and Austin (1981) have 
also applied the model in (I) to the problem of crop classification. 
It was demonstrated that $\alpha$, $\beta$ and $t_0$ could be used as features to
correctly classify corn and soybeans. Again these methods were
utilized on Landsat data, and the results were impressive on the
data considered. Austin (1980, 1981) has reported on more
extensive testing of these methods on LANDSAT data with the results
again being quite good. In this paper we examine (I) more closely
from the perspective of a desirable mathematical model for describing 
crop greenness. Some shortcomings of Model I are noted and some 
modifications are proposed. It is shown how this modified model 
can be utilized for crop classification from LANDSAT data. The 
results are then demonstrated on some LANDSAT corn-soybean data. 

Analysis 

Even though a mathematical model may perform well on a selected 
number of data sets, it seems desirable that it also satisfy some of 
the more obvious physical constraints imposed by the phenomenon it 
seeks to explain. If this is not the case, i.e., if it does not 
satisfy such constraints, then it behooves the investigator to
explain why such constraints can be relaxed and the model still be 
expected to perform its function. 

Several properties which a function $p(t)$ for greenness should
possess are

(i) $p(t) = p_0$ for $t \le t_0$.

(ii) $p(t) = p_T$ for $t \ge t_T$, where $p_T$ is terminal greenness
and $t_T$ is the corresponding point in time.

(iii) $p'(t)$ should be independent of $p_0$ after full coverage.

(iv) $t_0$ should be a location parameter, i.e., $p$ should be a
function of $t - t_0$.

Several other criteria could be listed, but the above suffice 
for the current discussion. The condition (iv) requires some 
comment. Certainly the same variety of crop planted at greatly 
differing times would be expected to have greenness
characteristics which differ in more ways than simple translation. However,
the model in (I) is posed for crops in the same segment and as such
the planting dates of the same crop are not expected to differ
greatly even though it is possible. In any event it is the opinion
of these authors that effects of $t_0$, other than location effects, have
to be relegated to noise in the model or treated as producing a
different classification, not necessarily generically different
but labeled different, spring wheat and winter wheat for example.
Regardless, there is no reason to believe that the model in (I)
speaks to this problem. Moreover, in (I), $t_0$ is clearly not a
location parameter. Note also that although $p_b(t)$ satisfies (i),
it clearly fails to satisfy (ii) and (iii).

Actually Model I represents a considerable simplification 
of the general model suggested by Badhwar (1980). The following
definition for $p(t)$ makes use of that general model and the
function

$$E(t; \alpha, \beta) = t^{\alpha} \exp(-\beta t^2), \qquad (1)$$

demonstrated by Badhwar to be of some value in describing greenness.
Let $F(t)$ be a probability distribution function such that $F(t) = 0$
for $t \le 0$ and $F(t) = 1$ for $t \ge \lambda$. Then define

$$p(t) = [1 - pF(t-t_0)]\,p_0 + pF(t-t_0)\,[p_1 + D\,E(t-t_0; \alpha, \beta)], \qquad (2)$$

where

$p$ = proportion of ground covered for $t \ge \lambda$
$p_0$ = soil greenness
$p_1$ = crop greenness at terminal greenness
$t_0$ = emergence date
$\alpha$ = greenup parameter
$\beta$ = greendown parameter
$D$ = constant


In (2) clearly

a) $p(t_0) = p_0$

b) $p(t) \to p_0 + (p_1 - p_0)p$, the terminal greenness of the pixel, as $t \to \infty$

c) if $p = 1$, $p'(t)$ is independent of $p_0$ for $t \ge \lambda$

d) $t_0$ is a location parameter.

Thus interpreting (b) as a satisfactory approximation to
(ii), we can say that the model in (2) satisfies conditions (i)-
(iv) and makes use of important aspects of the exponential
function found by Badhwar as a model for greenness.
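
Properties (a), (b) and (d) can be checked numerically with a short sketch of the model in (2). This is an illustration only: the function names, the linear ramp used for $F$, and every numerical value are assumptions, not quantities taken from the report.

```python
import math


def E(s, alpha, beta):
    """Function (1) evaluated at s = t - t0; taken as zero before emergence."""
    if s <= 0:
        return 0.0
    return s ** alpha * math.exp(-beta * s * s)


def ramp(s, width=40.0):
    """Assumed distribution function F: linear rise from 0 to 1 over `width` days."""
    return min(max(s / width, 0.0), 1.0)


def general_profile(t, t0, p, p0, p1, D, alpha, beta, F=ramp):
    """Model (2): soil and crop greenness weighted by ground cover p * F."""
    s = t - t0
    cover = p * F(s)
    return (1.0 - cover) * p0 + cover * (p1 + D * E(s, alpha, beta))


# Hypothetical parameter values chosen only to exercise the properties.
args = dict(p=0.8, p0=7.0, p1=12.0, D=1.0, alpha=1.07, beta=2.07e-4)
at_emergence = general_profile(150.0, 150.0, **args)      # property (a)
late_season = general_profile(550.0, 150.0, **args)       # property (b)
shifted = general_profile(210.0, 160.0, **args)           # property (d)
unshifted = general_profile(200.0, 150.0, **args)
```

Here `at_emergence` equals the soil greenness, `late_season` approaches $p_0 + (p_1 - p_0)p$, and shifting both $t$ and $t_0$ by the same amount leaves the value unchanged, which is what it means for $t_0$ to be a location parameter.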

Unfortunately the model in (2) has 9 unknown parameters
(assuming that the distribution $F(t)$ has one unknown parameter).
Since the data to which we intend to apply our model consist
of no more than 8 acquisitions, (2) is obviously not acceptable.
The problem is complicated by the fact that it is desirable to
classify the data as early as possible. Therefore, from a practical
point of view, one can probably only count on 4 to 6 acquisitions
before a classification must be made. This clearly eliminates (2)
as a practical model.

Rather than abandon (2), we will now investigate the
possibility of reducing the number of unknown parameters. In the pages
which follow, we will investigate the effects of the simplifications
we impose. Since the data to be considered include no information
for separately estimating $p$, the model can with no loss in generality
be rewritten as

$$p(t) = p_0 + [A + B\,E(t-t_0; \alpha, \beta)]\,F(t-t_0) \qquad (3)$$

where

$$A = (p_1 - p_0)\,p, \qquad B = pD,$$


and now A and B are the unknown parameters to be estimated. The 
model in (3), therefore, requires 8 parameters to be estimated, a 
reduction of 1. 
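
The step from (2) to (3) can be verified by expanding (2) and collecting the terms that multiply $F(t-t_0)$. The short derivation below is added for convenience and uses only the symbols already defined:

```latex
\begin{aligned}
p(t) &= [1 - pF(t-t_0)]\,p_0 + pF(t-t_0)\,[p_1 + D\,E(t-t_0;\alpha,\beta)] \\
     &= p_0 + [\,p\,(p_1 - p_0) + pD\,E(t-t_0;\alpha,\beta)\,]\,F(t-t_0) \\
     &= p_0 + [A + B\,E(t-t_0;\alpha,\beta)]\,F(t-t_0).
\end{aligned}
```

Since $p$, $p_1$ and $D$ enter only through the two products $A$ and $B$, they cannot be separated from the data, which is why one parameter is saved.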

For the data to be considered there is no deleterious effect
in going from (2) to (3) since no data are available from which to
estimate $p$, $p_1$, and $D$ separately. It should be noted that (3)
applies whether or not we have full crop coverage (i.e., whether
or not $p = 1$). Of course the number of parameters in (3) is still
too large to be useful.

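Investigation of LANDSAT corn-soybean data suggests that
assuming $F(t)$ to be the distribution function associated with
a uniform density over $(0, \lambda)$ yields a reasonable linear
approximation to $F(t)$. Under this assumption we have

$$F(t-t_0) = \begin{cases} 0, & t < t_0 \\ \dfrac{t-t_0}{\lambda-t_0}, & t_0 \le t \le \lambda \\ 1, & \lambda < t \end{cases} \qquad (4)$$

and (3) becomes

$$p(t) = \begin{cases} p_0, & t < t_0 \\ p_0 + A\,\dfrac{t-t_0}{\lambda-t_0} + B\,\dfrac{(t-t_0)^{\alpha+1}}{\lambda-t_0}\,\exp(-\beta(t-t_0)^2), & t_0 \le t \le \lambda \\ p_0 + A + B\,(t-t_0)^{\alpha}\exp(-\beta(t-t_0)^2), & \lambda < t. \end{cases} \qquad (5)$$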


The model in (5) represents a reduction of one parameter over (3) since
the parameter $\lambda$ is absorbed in the uniform distribution. From
(3), note that for $t_0 < t < \lambda$

$$p'(t) = A\,F'(t-t_0) + B\,[E'(t-t_0; \alpha, \beta)\,F(t-t_0) + E(t-t_0; \alpha, \beta)\,F'(t-t_0)]. \qquad (6)$$

But, if $\alpha > 0$,

$$E'(t-t_0; \alpha, \beta)\,F(t-t_0) + E(t-t_0; \alpha, \beta)\,F'(t-t_0) \to 0 \quad \text{as } t \to t_0^{+}.$$

The left hand derivative of $p(t)$ at $t_0$ is clearly zero. Therefore,
the derivative of $p(t)$ exists at $t_0$ if and only if

$$A\,F'(t_0) = 0, \qquad (7)$$

where here $F'(t_0)$ denotes the right hand derivative at $t_0$. Since
this seems desirable and $F'(t_0) \ne 0$, we are left with requiring $A = 0$.
Since $p_1 \ne p_0$ this is clearly incorrect. However, it does not seem
that taking $p_1 = p_0$ would seriously degrade the model's ability to
classify since $p_1$ will probably not differ greatly from $p_0$, and $p_1$ may be
nearly constant from crop to crop. Essentially this error is due
to our linear approximation of $F(t)$, for if $F(t)$ were quadratic
the requirement that $A = 0$ could be eliminated. Nevertheless,
for the reasons mentioned above, and the fact that it results in
one less parameter, we now take $A = 0$ in (3) to obtain the model
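
$$p(t) = \begin{cases} p_0, & t < t_0 \\ p_0 + B\,\dfrac{(t-t_0)^{\alpha+1}}{\lambda-t_0}\,\exp(-\beta(t-t_0)^2), & t_0 \le t \le \lambda \\ p_0 + B\,(t-t_0)^{\alpha}\exp(-\beta(t-t_0)^2), & \lambda < t. \end{cases} \qquad \text{(II)}$$

Note that

$$p'(t) = \begin{cases} \dfrac{B(\alpha+1)}{\lambda-t_0}\,(t-t_0)^{\alpha}e^{-\beta(t-t_0)^2} - \dfrac{2\beta B}{\lambda-t_0}\,(t-t_0)^{\alpha+2}e^{-\beta(t-t_0)^2}, & t_0 < t < \lambda \\ \alpha B\,(t-t_0)^{\alpha-1}e^{-\beta(t-t_0)^2} - 2\beta B\,(t-t_0)^{\alpha+1}e^{-\beta(t-t_0)^2}, & \lambda < t. \end{cases}$$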


Thus for $p'(t)$ to exist at $\lambda$ we must have

$$B\,(\lambda-t_0)^{\alpha-1}e^{-\beta(\lambda-t_0)^2} = 0. \qquad (8)$$


Unfortunately this cannot occur so we must examine the model further. Since our
desire is to simplify the model we do not wish to add additional terms
to the model which would guarantee (8), especially for the purpose of
fitting the curve in the right tail, since by that time the data will
already be classified. It can be demonstrated numerically that $B$, $\lambda$
and $\alpha$ play similar roles in Model II and as a result are jointly very
nonrobust to errors. This is particularly true of $B$ and $\alpha$. With
only a few data points it is, therefore, desirable to fix $\alpha$ or $B$ in
advance.

In other words, when there are only a few data points available,
and there is error in the model, small differences in data values
can lead to large differences in $B$ and $\alpha$. This is due to the fact
that for fixed $B$ or $\alpha$ a reasonable fit to the data can be obtained
by varying the other. We thus let $B = 1$ and arrive at the following
model

$$p(t) = \begin{cases} p_0, & t < t_0 \\ p_0 + \dfrac{(t-t_0)^{\alpha+1}}{\lambda-t_0}\,e^{-\beta(t-t_0)^2}, & t_0 \le t \le \lambda \\ p_0 + (t-t_0)^{\alpha}e^{-\beta(t-t_0)^2}, & \lambda < t, \end{cases} \qquad \text{(III)}$$

where $t_0, \alpha, \beta > 0$, $\lambda > t_0$.
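
A small numerical sketch makes the behavior of Model III at $\lambda$ concrete: the curve is continuous there, but its left and right derivatives differ by the quantity on the left side of (8) with $B = 1$. The code is an illustration with hypothetical function names, and the parameter values are the "typical" corn values reported later in Table III.

```python
import math


def model_III(t, p0, alpha, beta, t0, lam):
    """Greenness under Model III; lam is the date of maximum crop coverage."""
    s = t - t0
    if s <= 0:
        return p0
    decay = math.exp(-beta * s * s)
    if t <= lam:
        return p0 + s ** (alpha + 1.0) / (lam - t0) * decay
    return p0 + s ** alpha * decay


def derivative_jump_at_lambda(alpha, beta, t0, lam):
    """Left minus right derivative of Model III at lam (left side of (8), B = 1)."""
    w = lam - t0
    return w ** (alpha - 1.0) * math.exp(-beta * w * w)


# Typical corn values from Table III, with beta = .0001 * beta_c.
p0, alpha, beta, t0, lam = 7.0, 1.07, 2.07e-4, 150.0, 190.0
h = 1e-4
left_slope = (model_III(lam, p0, alpha, beta, t0, lam)
              - model_III(lam - h, p0, alpha, beta, t0, lam)) / h
right_slope = (model_III(lam + h, p0, alpha, beta, t0, lam)
               - model_III(lam, p0, alpha, beta, t0, lam)) / h
jump = derivative_jump_at_lambda(alpha, beta, t0, lam)
```

The difference between the two one-sided slopes matches `jump`, which is strictly positive, so the fitted curve always has a kink at $\lambda$.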
Model III is a five parameter model, and therefore is a
candidate for application on the data we will consider. Further
simplifications of this model come to mind. For example, one might
simply fix $\lambda$ as some maximum value. One might also argue that
attempting to fit two curves together at a point so late in time
as $\lambda$, with any degree of validity, requires data past the point
of interest, and hence Model III should be modified to

$$p(t) = \begin{cases} p_0, & t < t_0 \\ p_0 + B\,(t-t_0)^{\alpha}\exp(-\beta(t-t_0)^2), & t_0 \le t, \end{cases} \qquad \text{(IV)}$$

where now $B$ is again to be estimated.

Moreover, again noting that $B$ and $\alpha$ play much the same role, and
that a classification is desired as soon as possible, it might further
be argued that $p(t)$ could be reduced to the four parameter model

$$p(t) = \begin{cases} p_0, & t < t_0 \\ p_0 + (t-t_0)^{\alpha}\exp(-\beta(t-t_0)^2), & t_0 \le t. \end{cases} \qquad \text{(V)}$$


In the next section, we investigate via simulations the effects 
on classification of the above suggested simplifications. 

Feature Selection for Classification and Simulation 

Once an appropriate model has been obtained the problem of 
classification is not solved. The proper features to be utilized 
and the manner in which they are to be used must still be decided. 
Badhwar, Carnes, and Austin (1981) selected $\alpha$, $\beta$, and $t_0$ as the
appropriate features and utilized these in the Ho-Kashyap algorithm
(essentially the linear discriminant function) to form a discriminating
plane.



It seems reasonable that other "features" of the model obtained
might also prove to be as useful as, or more useful than, the model
parameters in separating crops. One feature which will be investigated
in the present report is the maximum value of the fitted curve. If
$t_m$ is the Julian date at which this maximum occurs, then $p(t_m)$ is
the corresponding feature of interest. In addition to the maximum
greenness it appears that $t_m - t_0$, i.e. the time from emergence to peak
greenness, is also a feature of potential importance in the
classification problem. As our investigations continue, we anticipate the
examination of still other features, but in the present report we
will examine only these two features in addition to the model
parameters as investigated by Badhwar, Carnes, and Austin.

Performance of the Proposed Profile Models 

In this paper we have discussed a general profile model which
we believe is appropriate for purposes of describing the greenness
of a crop across time. However, the general expression for the model
is such that estimation of the parameters would be impossible given
the 5-8 observations typically available from LANDSAT.
Thus, various simplifications of this model were proposed (Models II-V).
In this section, we will discuss the results of our preliminary
investigations into the performance of these models and Model I proposed by
Badhwar.

Our investigations have been primarily in two areas. First, we have
utilized Models I-V in order to estimate parameters and features from 1978
field data on corn and soybeans from Segment 882 in Palo Alto, Iowa. From
the results of these investigations, we are able to find typical Model III
parameters for corn and for soybeans. Second, these parameters are used
to simulate profile data from our "typical" corn model and soybean
model, and to investigate the performance of the various models based
upon these simulations. Model III was used in the simulation since
it was the most general model for which "typical" values of the
parameters could be found.

As mentioned earlier, we will not primarily be investigating
the models with respect to estimation of the model parameters but
for the purpose of ascertaining the effect on features such as
$t_m - t_0$ and $p(t_m)$ which may be used for classification. In Table I,
the value of $t_m$ is given for each of the models under consideration.


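Table I - Julian Date ($t_m$) of Maximum Greenness
Associated with Models I-V

Model      $t_m$

I          $\sqrt{\alpha/(2\beta)}$

II, III    $t_0 + \sqrt{(\alpha+1)/(2\beta)}$    if $\sqrt{(\alpha+1)/(2\beta)} \le \lambda - t_0$
           $t_0 + \sqrt{\alpha/(2\beta)}$        if $\sqrt{\alpha/(2\beta)} \ge \lambda - t_0$
           $\lambda$                             if $\sqrt{\alpha/(2\beta)} < \lambda - t_0 < \sqrt{(\alpha+1)/(2\beta)}$

IV, V      $t_0 + \sqrt{\alpha/(2\beta)}$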
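
The entries of Table I follow from setting the derivative of each branch to zero, and they can be evaluated directly. The sketch below is illustrative, with hypothetical function names; `width` stands for $\lambda - t_0$, and the two test cases use the typical corn and soybean values listed in Table III with $\beta$ = .0001 $\beta_c$.

```python
import math


def peak_offset_II_III(alpha, beta, width):
    """Time from emergence to maximum greenness for Models II and III."""
    ramp_peak = math.sqrt((alpha + 1.0) / (2.0 * beta))  # stationary point before lambda
    tail_peak = math.sqrt(alpha / (2.0 * beta))          # stationary point after lambda
    if ramp_peak <= width:
        return ramp_peak
    if tail_peak >= width:
        return tail_peak
    return width  # curve still rising up to lambda and falling after it


def peak_offset_IV_V(alpha, beta):
    """Time from emergence to maximum greenness for Models IV and V."""
    return math.sqrt(alpha / (2.0 * beta))


corn_offset = peak_offset_II_III(1.07, 2.07e-4, 40.0)
soybean_offset = peak_offset_II_III(1.24, 2.75e-4, 80.0)
```

For corn the maximum falls after $\lambda$ (second case), and for soybeans it falls before $\lambda$ (first case). The resulting offsets of roughly 51 and 64 days agree with the $t_m - t_0$ values in Table III.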


For each model to be considered here, the parameter estimation
was accomplished using Marquardt's (1963) method for unweighted least
squares estimation of nonlinear parameters.
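
For readers unfamiliar with Marquardt's method, the following is a minimal sketch of the damped Gauss-Newton iteration it uses. It is not the authors' implementation. To keep the linear algebra to a 2 x 2 system, it fits only $\alpha$ and $\beta_c$ in the rising-and-falling part of Model V, with $p_0$ and $t_0$ assumed known and already subtracted, and it uses noise-free synthetic data. All names, the starting values, and the damping schedule are assumptions.

```python
import math


def curve(s, alpha, beta_c):
    """Model V greenness above soil level at s = t - t0, with beta = .0001 * beta_c."""
    return s ** alpha * math.exp(-1e-4 * beta_c * s * s)


def sse(theta, s_values, y_values):
    """Unweighted sum of squared residuals."""
    return sum((y - curve(s, *theta)) ** 2 for s, y in zip(s_values, y_values))


def marquardt_fit(s_values, y_values, theta, mu=1e-3, iterations=50):
    """Damped Gauss-Newton (Marquardt) iterations for theta = (alpha, beta_c)."""
    best = sse(theta, s_values, y_values)
    for _ in range(iterations):
        a11 = a12 = a22 = g1 = g2 = 0.0
        for s, y in zip(s_values, y_values):
            f = curve(s, *theta)
            j1 = f * math.log(s)        # partial derivative with respect to alpha
            j2 = -1e-4 * s * s * f      # partial derivative with respect to beta_c
            r = y - f
            a11 += j1 * j1
            a12 += j1 * j2
            a22 += j2 * j2
            g1 += j1 * r
            g2 += j2 * r
        # Solve (J'J + mu * diag(J'J)) delta = J'r by Cramer's rule.
        d11 = a11 * (1.0 + mu)
        d22 = a22 * (1.0 + mu)
        det = d11 * d22 - a12 * a12
        delta = ((g1 * d22 - a12 * g2) / det, (d11 * g2 - a12 * g1) / det)
        trial = (theta[0] + delta[0], theta[1] + delta[1])
        trial_sse = sse(trial, s_values, y_values)
        if trial_sse < best:
            theta, best, mu = trial, trial_sse, mu / 10.0   # accept, relax damping
        else:
            mu *= 10.0                                       # reject, increase damping
    return theta, best


# Synthetic example: eight equally spaced acquisitions (an assumption).
s_values = [10.0 * k for k in range(1, 9)]
y_values = [curve(s, 1.07, 2.07) for s in s_values]
start = (1.0, 2.5)
theta_hat, final_sse = marquardt_fit(s_values, y_values, start)
```

Working with $\beta_c$ rather than $\beta$ keeps the two parameters on a comparable scale, which is presumably why the tables report $\beta_c$. A step is accepted only when it lowers the sum of squares, so the fit never gets worse from one iteration to the next.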

Parameter Estimation Utilizing 1978 Data from Segment 882 

In this section, we will report the parameter estimation results, 
based upon the utilization of Models I-V, for modeling the multitemporal
behavior of corn fields C01-C07 and soybean fields SY11-SY17. In
Figure 1, we have plotted the five models obtained for soybean field
SY. It is interesting to note the various "interpretations"
concerning the proper functional curve to fit to these eight points.
Notice in particular the fact that $t_0$, the emergence date, varies
considerably from model to model.

In Table II we present the results of the parameter estimation
based upon Models I-V. Several observations can be made concerning
the results displayed in Table II. A first observation is that
parameter estimates in Model II are less stable than those in
Model III. In Model II the parameter estimates of $\alpha$ and $B$ are quite
variable, a behavior which was discussed earlier in this report.
Based upon the results for Model II and Model III, it would appear
that indeed more stable estimates of $\alpha$ are obtained when $B$ is set
equal to a constant (in this case 1). It should be noted that the
stability of $\beta$ is also affected by the inclusion of $B$ in the model,
but not to the extent that $\alpha$ is affected. It appears that we simply
do not have a sufficient number of readings to obtain reliable
estimates of 6 parameters. It should be noted that the 1978 data for
segment 882 contain 8 observations. Obviously, in most situations,
as many as 8 observations will not be available and hence the need
to find a satisfactory reduced model is clear.

Using Model III, there is an indication that both $\alpha$ and $\beta$
are larger for soybeans than for corn, and reasonable separation
between the two crops could be made using these two parameters.

Also of interest is the fact that the estimation of the maximum
greenness and $t_m - t_0$ features in Model II are as stable as they are

Table II - Parameter Estimation for Corn and
Soybean Field Data - Segment 882

MODEL I

          Field   $\hat{\alpha}$    $\hat{\beta}_c$*  $\hat{t}_0$       Max               $t_m - t_0$

CORN      C01     15.2              1.16              146               39.9              68
          C02     18.9              2.14              147               43.8              63
          C03     21.5              2.42              149               52.1              62
          C04     18.6              2.13              146               44.8              62
          C05     14.4              1.60              134               44.2              78
          C06     19.1              2.21              147               43.8              61
          C07     18.6              2.15              146               43.9              62

SOYBEANS  SY11    24.9              2.52              165               42.6              57
          SY12    23.7              2.54              155               52.1              61
          SY13    21.7              2.40              152               48.4              61
          SY14    24.2              2.58              158               49.9              59
          SY15    24.9              2.67              158               52.3              58
          SY16    27.7              2.80              166               55.6              56
          SY17    26.9              2.82              163               51.5              55

*$\beta$ = .0001 $\beta_c$
MODEL II

          Field   $\hat{\alpha}$    $\hat{\beta}_c$   $\hat{t}_0$       $\lambda - t_0$   $\hat{B}$         Max               $t_m - t_0$

CORN      C01     .89               2.08              142               125               3.9               40.6              67
          C02     -.15              1.89              158               73                153.9             43.5              48
          C03     .39               2.20              154               68                23.3              52.2              56
          C04     -.23              1.91              158               63                220.7             44.9              45
          C05     1.02              1.38              141               45                1.0               44.6              61
          C06     .02               2.20              155               76                91.7              44.0              48
          C07     .93               2.76              144               123               4.8               45.5              59

SOYBEAN   SY11    1.06              2.03              152               70                1.20              41.7              71
          SY12    1.12              2.51              151               80                1.52              51.7              65
          SY13    2.06              2.85              155               0                 .03               49.3              60
          SY14    1.11              2.49              152               79                1.47              49.2              65
          SY15    1.15              3.13              157               74                1.57              52.5              58
          SY16    1.14              2.78              160               71                1.43              55.3              62
          SY17    1.12              2.64              155               76                1.50              50.8              62






MODEL III

          Field   $\hat{\alpha}$    $\hat{\beta}_c$   $\hat{t}_0$       $\lambda - t_0$   Max               $t_m - t_0$

CORN      C01     1.11              1.83              136               95                40.1              76
          C02     1.07              2.07              150               39                45.4              51
          C03     1.15              2.35              150               48                56.7              49
          C04     1.07              2.08              151               23                45.5              51
          C05     1.01              1.34              141               45                44.4              61
          C06     1.07              2.01              146               45                45.9              52
          C07     1.06              2.04              149               27                44.6              51

SOYBEAN   SY11    1.10              1.98              151               80                41.6              73
          SY12    1.24              2.75              152               79                52.2              64
          SY13    1.12              2.06              146               62                52.7              62
          SY14    1.21              2.59              151               80                49.6              65
          SY15    1.26              3.10              155               76                52.5              60
          SY16    1.23              2.80              159               72                55.3              63
          SY17    1.23              2.76              154               77                51.1              64


MODEL IV

          Field   $\hat{\alpha}$    $\hat{\beta}_c$   $\hat{t}_0$       $\hat{B}$         Max               $t_m - t_0$

CORN      C01     .70               1.59              158               3.24              40.1              47
          C02     1.10              2.35              156               .93               44.5              48
          C03     2.29              3.00              146               .01               53.2              62
          C04     .81               2.13              158               2.72              45.6              44
          C05     .51               1.43              158               7.10              44.7              42
          C06     1.08              2.40              155               1.03              44.6              47
          C07     .75               2.11              158               3.41              44.7              42

SOYBEAN   SY11    3.24              3.09              148               .0002             42.5              72
          SY12    3.02              3.13              145               .0005             52.5              69
          SY13    2.98              2.85              138               .0006             49.3              72
          SY14    2.97              3.41              149               .0007             50.5              66
          SY15    3.00              3.70              151               .0008             53.0              64
          SY16    2.88              3.46              156               .0010             55.6              65
          SY17    3.05              [illegible]       150               .0306             51.7              67


MODEL V

          Field   $\hat{\alpha}$    $\hat{\beta}_c$   $\hat{t}_0$       Max               $t_m - t_0$

CORN      C01     1.02              1.85              155               40.4              52
          C02     1.08              2.34              156               44.5              48
          C03     1.14              2.48              158               52.3              48
          C04     1.09              2.41              155               45.9              48
          C05     1.05              1.86              152               45.1              53
          C06     1.09              2.40              155               44.6              48
          C07     1.09              2.46              155               45.1              47

SOYBEAN   SY11    1.09              2.87              174               42.4              44
          SY12    1.18              3.23              170               53.2              43
          SY13    1.10              2.25              158               48.3              50
          SY14    1.17              3.41              172               51.1              41
          SY15    1.20              3.69              172               53.8              40
          SY16    1.19              3.21              175               55.6              43
          SY17    1.18              3.35              172               52.1              42
in Model III. Thus, although the addition of the extra parameter $B$
caused problems with the stability of the parameter estimation, the
fitted curves were quite consistent at least with regard to the two
features. From Models I and III we see that the maximum greenness
is greater for soybeans than for corn, and that corn reaches its
maximum greenness somewhat sooner than do soybeans. Again, reasonable
separation between corn and soybeans could have been obtained based
upon these two features for either Model II or Model III. The
estimation of $t_0$ is approximately equally stable for Models II and III,
with a slight indication being given that soybeans emerged later
than corn. (Thus at least at this point there appears to be no
negative effect in going from Model II to III.) Also of interest
is the fact that the addition of the parameter $B$ has a tremendous
effect on $\lambda - t_0$, i.e. the time from emergence to maximum crop
coverage (as mentioned previously, maximum crop coverage need not
be total coverage for our model to apply). Based upon the data
from Model III, it appears that time to total coverage, $\lambda - t_0$,
is longer for soybeans than for corn.

The comparisons between Models IV and V are similar to those
between Models II and III. In particular, the inclusion of the
parameter $B$ in the model results in unstable estimates of both $B$
and $\alpha$. For these models, the general tendency is for $\alpha$ and $\beta$ to
be larger for soybeans than for corn. The emergence date, $t_0$,
is of considerable interest. For Model IV, there does not seem
to be any difference between emergence date for corn and soybeans.
However, for Model V the estimate of $t_0$ for soybeans is approximately
170, which is significantly later than that for corn. Again, maximum
greenness and $t_m - t_0$ seem to be stable features for both models,
with soybeans attaining a larger value of greenness. The result
of the late estimate of $t_0$ in Model V is to cause $t_m - t_0$ to not
separate crops, whereas for Model IV this separation was apparent.

The parameter estimation using Badhwar's Model I was quite
stable. Again, the tendency is for $\alpha$ and $\beta$ to be larger for
soybeans than for corn, emergence date to be later for soybeans,
soybeans to attain a higher greenness, and for corn to attain its
maximum greenness earlier than soybeans.

Simulations 

In order to gain a better understanding of these models we
have examined their performance in a simulation study. As a
result of the parameter estimation study using Segment 882, we
selected a typical set of corn parameters and a typical set of
soybean parameters for Model III. These parameters and associated
features are given in Table III.

Table III - Parameters and Features of
Corn and Soybean Models (Model III)

                    Corn      Soybeans

$p_0$               7.0       7.0
$\alpha$            1.07      1.24
$\beta_c$*          2.07      2.75
$t_0$               150.0     150.0
$\lambda - t_0$     40.0      80.0
Max                 46.2      52.0
$t_m - t_0$         51.0      64.0

*$\beta$ = .0001 $\beta_c$
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Note that although there was an Indication that soybeans emerged 
somewhat later than corn on Segment 882, these simulations are 
based upon a common emergence date. One hundred realizations from 
each model were generated. 

The simulated observations were of the form

    $\rho_s(t) = \rho(t) + \omega(t)\,e(t)$

where $\rho(t)$ is as defined in Model III and $e(t)$ is a normal
random variable with zero mean and unit variance. Note that $e(t)$ and
$e(t')$ are independent if $t \ne t'$. In the simulation results
presented here we have also taken $\omega(t) \equiv 1$.
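
The generation of realizations described above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the acquisition dates, and in particular `placeholder_profile` are hypothetical, and the placeholder curve is not the report's Model III; any profile function could be passed in its place.

```python
import math
import random

def simulate_realizations(profile, times, weight=lambda t: 1.0,
                          n_realizations=100, seed=0):
    """Return n_realizations noisy copies of profile(t) at the given times.

    Each observation is profile(t) + weight(t) * e(t), with e(t) independent
    standard normal draws; weight defaults to the constant 1.
    """
    rng = random.Random(seed)
    return [[profile(t) + weight(t) * rng.gauss(0.0, 1.0) for t in times]
            for _ in range(n_realizations)]

def placeholder_profile(t, rho0=7.0, t0=150.0):
    """Hypothetical hump-shaped curve used only to exercise the simulator."""
    if t <= t0:
        return rho0
    d = t - t0
    return rho0 + 2.0 * d * math.exp(-d / 50.0)

times = list(range(130, 290, 18))   # hypothetical acquisition days
sims = simulate_realizations(placeholder_profile, times)
```

Each inner list is one realization; fitting a model to every realization and summarizing the fitted parameters gives statistics of the kind reported in the simulation tables.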

Models I-V were applied to each realization within a set and
parameter estimates and features were obtained. Summary statistics
describing the results of these simulations are presented in Table IV.
For each parameter we indicate the average of the parameter values
obtained over the 100 realizations, the coefficient of variation in
order to provide an indication of relative variability of each
parameter, and lower and upper .90 content tolerance limits with 95%
level of confidence. In other words there is a 95% level of
confidence that 90% of parameter estimates obtained in this manner
would fall between the two tolerance values given. These values will
assist the reader in discerning the separability of the two crops on
the basis of the given parameter. It should be noted that these
tolerance limits are based upon an assumption that parameter
estimates obtained in these ways will be normally distributed. This
may or may not be a good assumption, but nevertheless the tolerance
limits given should provide crop separability information to the
reader.
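
As a rough illustration of how normal-theory tolerance limits of this kind can be computed, the sketch below uses Howe's approximation to the two-sided tolerance factor together with the Wilson-Hilferty approximation to the chi-square quantile. Both approximations and all function names are assumptions made here for a self-contained example; the report does not say how its limits were calculated.

```python
import math
from statistics import NormalDist, mean, stdev

def chi2_quantile_wh(prob, df):
    """Wilson-Hilferty approximation to the chi-square quantile."""
    z = NormalDist().inv_cdf(prob)
    return df * (1.0 - 2.0 / (9.0 * df) + z * math.sqrt(2.0 / (9.0 * df))) ** 3

def tolerance_factor(n, content=0.90, confidence=0.95):
    """Howe's approximate factor k for limits of the form mean +/- k * sd."""
    z = NormalDist().inv_cdf((1.0 + content) / 2.0)
    chi2 = chi2_quantile_wh(1.0 - confidence, n - 1)
    return math.sqrt((n - 1) * (1.0 + 1.0 / n) * z * z / chi2)

def tolerance_limits(estimates, content=0.90, confidence=0.95):
    """Lower and upper tolerance limits assuming normally distributed estimates."""
    k = tolerance_factor(len(estimates), content, confidence)
    m, s = mean(estimates), stdev(estimates)
    return m - k * s, m + k * s

k = tolerance_factor(100)
```

With 100 realizations the factor is about 1.87, so an average of 18.6 with a coefficient of variation of .04 gives roughly 18.6 plus or minus 1.4, which is in line with the limits 17.2 and 20.0 listed for corn under Model I in Table IV.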

                              Table IV

Model I

                    $\hat\alpha$  $\hat\beta_c$*  $\hat t_0$     Max    $t_m - t_0$

CORN      $\bar x$      18.6         2.10          147.5       44.6      62.7
          CV             .04          .05            .01        .01       .03
          LTL           17.2         1.92          143.6       43.4      58.8
          UTL           20.0         2.28          151.4       45.8      66.6

SOYBEANS  $\bar x$      22.9         2.48          154.6       51.9      60.4
          CV             .04          .04            .01        .01       .04
          LTL           19.0         2.28          150.8       50.8      56.3
          UTL           24.6         2.68          158.4       53.0      64.5

Model II

                    $\hat\alpha$  $\hat\beta_c$*  $\hat t_0$  $\lambda - t_0$  $\hat B$    Max    $t_m - t_0$

CORN      $\bar x$      1.11         2.02          152.1        2.82          2.05     45.6      53.2
          CV             .33          .22            .04         .61          2.26      .04       .17
          LTL            .42         1.19          141.9         0.0           0.0     41.8      36.6
          UTL           1.80         2.85          162.3        60.7          10.7     49.4      69.8

SOYBEANS  $\bar x$      1.13         2.69          151.3        78.3          3.82     52.1      62.7
          CV             .33          .10            .04         .11          2.17      .01       .09
          LTL            .42         2.21          141.2        61.9          0.00     50.8      51.6
          UTL           1.84         3.17          161.4        94.7         19.36     53.4      73.8

Model III

                    $\hat\alpha$  $\hat\beta_c$*  $\hat t_0$  $\lambda - t_0$    Max    $t_m - t_0$

CORN      $\bar x$      1.06         2.04          152.3        28.4          45.6      51.9
          CV             .04          .19            .03         .57           .04       .12
          LTL            .98         1.32          144.5         0.0          42.5      40.6
          UTL           1.14         2.76          160.1        58.9          48.7      63.2

SOYBEANS  $\bar x$      1.22         2.60          149.8        75.7          52.6      65.2
          CV             .03          .10            .02         .11           .02       .05
          LTL           1.15         2.13          145.2        59.5          50.6      58.9
          UTL           1.29         3.07          154.4        91.9          54.6      71.5

Model IV

                    $\hat\alpha$  $\hat\beta_c$*  $\hat t_0$   $\hat B$     Max    $t_m - t_0$

CORN      $\bar x$      1.02         2.33          157.5        1.76       45.3      46.4
          CV             .18          .08            .02        2.17        .02       .10
          LTL            .68         1.97          151.4        0.00       43.9      38.0
          UTL           1.36         2.69          163.6        8.94       46.7      54.8

SOYBEANS  $\bar x$      2.31         3.19          152.2         .03       52.7      60.2
          CV             .12          .06            .03        3.39        .01       .07
          LTL           1.79         2.84          144.4         .00       51.5      51.8
          UTL           2.83         3.54          160.0         .21       53.9      68.6

Model V

                    $\hat\alpha$  $\hat\beta_c$*  $\hat t_0$     Max    $t_m - t_0$

CORN      $\bar x$      1.08         2.41          156.9       45.5      47.5
          CV             .01          .05            .01        .01       .02
          LTL           1.06         2.19          155.1       44.2      45.5
          UTL           1.10         2.63          158.7       46.6      49.5

SOYBEANS  $\bar x$      1.18         3.23          168.5       53.1      42.7
          CV             .01          .05           .005        .01       .02
          LTL           1.16         2.91          167.0       51.8      40.9
          UTL           1.20         3.55          170.0       54.5      44.5

*$\beta = .0001\,\beta_c$

The results of the simulations are similar to the results
obtained from Segment 882 data. For example, the estimation of $B$
in Models II and IV is quite unstable as indicated by the large
coefficients of variation. For all five models, maximum greenness
is a stable feature which seems to provide good separation between
corn and soybeans. The time from emergence to maximum greenness
is not uniformly as stable a feature as maximum greenness, yet it
seems to provide separation between crops for all models except
Model I.

The parameters $\alpha$ and $\beta$ tend to be larger for soybeans
than for corn in all models. However, as seen in the Segment 882 data,
the estimation of $\alpha$ is not as stable in Models II and IV
involving the $B$ parameter. In Models II and IV the estimation of $B$
is very unstable. Also of note is the fact that the estimation of
$\lambda - t_0$ in Models II and III is not as stable as one would
hope. In Models II and III the estimated parameters can be compared
with the true parameters given in Table III which were used in the
simulations.

In this situation, the most difficult parameter to estimate appears
to be the parameter $B$ in Model II. Data were generated from Model
III, which is Model II with $B = 1$. However, fitting Model II to the
data yields estimates for $B$ of 2.05 and 3.82 for corn and soybeans
respectively. In addition, $\lambda - t_0$ is seen to be difficult to
estimate, being significantly underestimated for corn.

Of additional interest is the estimation of $t_0$. There seems
to be separation between crops based upon $\hat t_0$ for Model I and
Model V. This is surprising since the true value for $t_0$ in the
simulation model (Model III) was set at 150 for both corn and
soybeans. If Model III is a reasonably good approximation to the true
growth model (and we believe it is) then differences in the
estimation of $t_0$ using models such as Models I and V may be due to
adjustments which must be made in fitting a non-optimal model to a
set of data. It seems that crop separation based upon $t_0$ must be
viewed with caution. It is clear that if, for example, $t_0 = 165$ for
corn and $t_0 = 150$ for soybeans and Model II were the appropriate
model, then probably no separation between the two crops would be
seen using Model V on the basis of $\hat t_0$.

A final observation will be made concerning the role of $t_0$
in Models I-V. Obviously in Models II-V, $t_0$ is a location
parameter. As such, the shifting of each date in a set of observations
by K will result in no change in the estimation of the other model
parameters as long as the starting value for $t_0$ is also shifted by
K. However, $t_0$ in Model I is not a location parameter, and it is
of importance to understand the effect on the remaining parameters
of Model I which results from this shift by K. In Table V we
illustrate these results for K = -10, 0, 10, and 20. As an
explanation of these results note, for example, that the 100 corn
realizations which were analyzed by Models I-V in Table IV were
again utilized here and the results for K = 0 are identical to
those in Table IV. For K = -10, the 100 profile realizations
remained unchanged yet the generated profile value for time t is
now associated with time t-10, i.e. we have assumed that emergence
date occurred 10 days earlier than $t_0 = 150$. The corresponding
parameter estimates in Table V are those estimates obtained by
applying Model I to this augmented data set. Note that the
starting value for $t_0$ was also adjusted by -10. Similar
procedures resulted in the remaining entries in Table V. Note that
$\hat\alpha$ is the only parameter significantly affected by this
shift in emergence date. However, based upon Table V we see that
soybeans with emergence date of $t_0 = 140$ would be relatively
indistinguishable from corn with emergence date $t_0 = 160$ on the
basis of $\hat\alpha$. However, the separability associated with
$\hat\beta$ and the "features" is still present.

TABLE V - The Effects of Shifts in $t_0$ on the Parameters of Model I

                $t_0$       $\hat\alpha$  $\hat\beta_c$*    Max    $t_m - t_0$

CORN           150 - 10        16.8          2.10         44.6      62.5
               150             18.6          2.10         44.6      62.7
               150 + 10        20.4          2.10         44.6      62.9
               150 + 20        22.3          2.10         44.6      63.1

SOYBEANS       150 - 10        20.8          2.43         51.9      60.2
               150             22.9          2.48         51.9      60.4
               150 + 10        25.1          2.48         51.9      60.6
               150 + 20        27.3          2.47         51.8      61.4

*$\beta = .0001\,\beta_c$

There is a final observation that should be made concerning
Table V and Model I. That is, from Table I, $\hat\alpha$ appears to
be a monotonically increasing function of $\hat t_0$ and vice versa.
The impact of this is that late emergence dates give significantly
larger values of $\hat\alpha$, so that in this model $\hat\alpha$ is
certainly not a reliable feature. The reverse is also true, i.e.,
$\hat t_0$ is a monotonically increasing function of $\hat\alpha$.
Therefore large values of $\hat\alpha$ give large values of
$\hat t_0$. This is obviously highly undesirable and as a result, one
could not expect reliable estimates of $t_0$ from Model I.
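
The monotone dependence can be made concrete with a small calculation on the Table V entries. The sketch below transcribes the fitted first-column parameter values for the four emergence dates and computes an ordinary least-squares slope; the helper name `ls_slope` is introduced here for illustration only.

```python
# Fitted values from Table V (Model I applied to date-shifted realizations).
t0 = [140, 150, 160, 170]
alpha_corn = [16.8, 18.6, 20.4, 22.3]
alpha_soy = [20.8, 22.9, 25.1, 27.3]

def ls_slope(x, y):
    """Ordinary least-squares slope of y on x."""
    xbar = sum(x) / len(x)
    ybar = sum(y) / len(y)
    num = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    den = sum((xi - xbar) ** 2 for xi in x)
    return num / den

slope_corn = ls_slope(t0, alpha_corn)   # about 0.18 per day
slope_soy = ls_slope(t0, alpha_soy)     # about 0.22 per day
```

A shift of roughly 20 days in emergence date therefore moves the fitted value by about 4 units, which is comparable to the corn-soybean difference at a common emergence date.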

The validity of this observation on actual data is borne out by
inspecting Table II. Note that $\hat\alpha$ is nearly a monotonically
increasing function of $\hat t_0$. The pattern is also clear for
soybeans, i.e. larger values of $\hat t_0$ tend to give larger values
for $\hat\alpha$. Thus whether from a careful analysis of the actual
data or the simulated data, $\hat\alpha$ from Model I by itself should
not be considered a viable parameter for use in discriminating corn
and soybeans. Moreover Model I should not be expected to produce
reliable estimates of $t_0$.

Final Comments 


We believe that the results in this paper provide important
information concerning both the development and the performance
of various temporal profile models. It should be emphasized,
however, that the results presented here are very preliminary in
nature. Further investigation into the performance of these
models is suggested in order to provide more experience with both
real data and simulation. It is the opinion of the authors that
the performance of "features" such as $\widehat{\mathrm{Max}}$ and
$\hat t_{\max} - t_0$ should be investigated further. From the
discussion in the previous section, we definitely do not recommend
using $\hat\alpha$ in Model I. Further investigations should also
consider the problem of separability by finding discriminating
surfaces based upon the utilization of more than one parameter or
feature.
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A COMPARISON OF MINIMUM DISTANCE AND 
MAXIMUM LIKELIHOOD TECHNIQUES 
FOR PROPORTION ESTIMATION 


Wayne A. Woodward, William R. Schucany, 

Hildegard Lindsey, and H. L. Gray 
Center for Applied Mathematical and Statistical Research 
Southern Methodist University 

1. Introduction 

A common objective in remote sensing is the estimation
of the proportions $p_1, p_2, \ldots, p_m$ in the mixture density

    $f(x) = p_1 f_1(x) + p_2 f_2(x) + \cdots + p_m f_m(x)$        (1.1)

where m is the number of components (crops) in the mixture
and for component i, $f_i(x)$ is a (possibly multivariate)
density. In past practice this density has been assumed to
be (multivariate) normal with X being the reflected energy
in four bands of the light spectrum, certain linear
combinations of these readings, or other derived "feature"
variables. Generally the parameter estimation has been
accomplished using maximum likelihood techniques. In this
paper we examine the use of minimum distance estimation as
an alternative to maximum likelihood and we will compare
the performance of the two estimation techniques when
dealing with mixtures of normal and of non-normal densities
with varying amounts of separation. We will focus on the
mixture of two univariate distributions given by

    $f(x) = p f_1(x) + (1-p) f_2(x)$.        (1.2)

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We are also assuming that only data from the mixture 
distribution are available. Other sampling schemes in which 
training samples from the component distributions are also 
available have been discussed by Hosmer (1973),
Redner (1980), and Hall (1981) among others.

2. Estimation in the Mixture of Normals Model 

In this section we will assume that $f_1(x)$ and $f_2(x)$ in
(1.2) are normal densities with mean and variance $\mu_1$,
$\sigma_1^2$ and $\mu_2$, $\sigma_2^2$ respectively, where it is
assumed that all five parameters $\mu_1$, $\sigma_1^2$, $\mu_2$,
$\sigma_2^2$, and p are unknown. Techniques for estimating these
parameters will be discussed.

(a) Maximum Likelihood 

Several recent articles have dealt with the problem of
obtaining the maximum likelihood estimates of $\mu_1$,
$\sigma_1^2$, $\mu_2$, $\sigma_2^2$, and p (Hasselblad (1966),
Day (1969), Wolfe (1970), Hosmer (1975), Fowlkes (1979),
Lennington and Rassbach (1979), and Redner (1980)). Since the
likelihood function

    $L = f(x_1) f(x_2) \cdots f(x_n)$        (2.1)

where n is the sample size, is not a bounded function in
this case (see Day (1969)), the objective in the maximum
likelihood approach is to find a local maximum of L. This
maximum is usually found by setting the partial derivatives
of log(L) with respect to each of the 5 parameters equal to
zero and solving the resulting set of equations, called the
likelihood equations. Since closed form solutions of these
equations do not exist, they must be solved using iterative
techniques. Hasselblad (1966) and Wolfe (1969) suggested that
these equations be solved by taking advantage of their
fixed point form. Redner (1980) and Redner and Walker (1982)
have pointed out that this fixed point technique is
essentially an application of the EM algorithm (see
Dempster, Laird and Rubin (1977)) with the only difference
being that using the EM algorithm, the estimates of
$\sigma_1^2$ and $\sigma_2^2$ at step k involve the updated
kth step estimates of $\mu_1$ and $\mu_2$.

Fowlkes (1979), on the other hand, maximized the
likelihood function directly by utilizing a quasi-Newton
method for minimizing -log(L) and found that good starting
values were crucial for acceptable performance.
Hosmer (1975) stated that using the likelihood equations,
starting values were not a serious problem in his
experience. In order to determine which of the two
techniques seemed preferable in our simulation studies we
replicated simulations performed by Fowlkes in which
various sets of poor starting values were used to initiate
the minimization procedure. We simulated realizations from
the mixture utilized by Fowlkes and estimated the
parameters using both direct maximization and the EM
algorithm. The results of our simulations indicate that
the EM algorithm approach is preferable and hence we have
used this technique for obtaining MLEs in our simulations.
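
For readers unfamiliar with the EM iteration for a two-component normal mixture, a minimal sketch follows. It is not the authors' program: the function names, the stopping rule, the starting values in the usage lines, and especially the `var_floor` guard are assumptions added here so that the example runs safely on its own. The variance update uses the means from the same step, as described above.

```python
import math
import random

def normal_pdf(x, mu, var):
    """Density of a normal distribution with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_two_normals(xs, p, mu1, var1, mu2, var2,
                   max_iter=500, tol=1e-8, var_floor=1e-6):
    """EM iteration for p*N(mu1, var1) + (1-p)*N(mu2, var2)."""
    n = len(xs)
    for _ in range(max_iter):
        # E-step: posterior probability that each point came from component 1.
        w = []
        for x in xs:
            a = p * normal_pdf(x, mu1, var1)
            b = (1.0 - p) * normal_pdf(x, mu2, var2)
            w.append(a / (a + b))
        s1 = sum(w)
        s2 = n - s1
        # M-step: weighted proportion, means, then variances about the new means.
        new_p = s1 / n
        new_mu1 = sum(wi * x for wi, x in zip(w, xs)) / s1
        new_mu2 = sum((1.0 - wi) * x for wi, x in zip(w, xs)) / s2
        new_var1 = sum(wi * (x - new_mu1) ** 2 for wi, x in zip(w, xs)) / s1
        new_var2 = sum((1.0 - wi) * (x - new_mu2) ** 2 for wi, x in zip(w, xs)) / s2
        # Assumed safeguard (not from the report): keep variances off zero.
        new_var1 = max(new_var1, var_floor)
        new_var2 = max(new_var2, var_floor)
        change = abs(new_p - p) + abs(new_mu1 - mu1) + abs(new_mu2 - mu2)
        p, mu1, var1, mu2, var2 = new_p, new_mu1, new_var1, new_mu2, new_var2
        if change < tol:
            break
    return p, mu1, var1, mu2, var2

# Usage on a hypothetical, well separated mixture with true p = .3.
rng = random.Random(1)
sample = [rng.gauss(0.0, 1.0) if rng.random() < 0.3 else rng.gauss(5.0, 1.0)
          for _ in range(1000)]
est = em_two_normals(sample, 0.5, min(sample), 1.0, max(sample), 1.0)
```

Because the likelihood is unbounded, an iteration of this kind finds a local maximum that depends on where it is started, which is why the choice of starting values matters.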




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Although ML estimation procedures are known to have 
certain optimality properties, their sensitivity to
violations of the underlying assumptions is also
recognized. The development of estimation procedures which
perform well even under moderate deviations from
assumptions has been a topic of major interest in recent
literature. One of these robust procedures which has
received recent attention is that of minimum distance (MD)
estimation introduced by Wolfowitz (1957). Parr and
Schucany (1980), for example, have shown that MD techniques
provide robust estimators of the location parameter of a
symmetric distribution. Minimum distance estimation has
been used for parameter estimation in the mixture model by
Choi and Bulgren (1968) and MacDonald (1971) with some
success although, to our knowledge, the question of
sensitivity to assumptions in this setting has not been
addressed. These previous authors assumed that the
parameters of the component distributions were known and
that only the mixing proportion(s) was to be estimated.

In order to briefly describe minimum distance
estimation, we let $X_1, X_2, \ldots, X_n$ denote a random sample
from a population with distribution function F and let $F_n$
denote the empirical distribution function, i.e. $F_n(x) = k/n$
where k is the number of observations less than or equal
to x. Further, let $\mathcal{H} = \{H_\theta : \theta \in \Omega\}$
denote a family of distributions depending on the possibly vector
valued parameter $\theta$. The MD estimate of $\theta$ is that value
of $\theta$ for which the distance between $F_n$ and $H_\theta$ is
minimized. It is not necessary that $F \in \mathcal{H}$. Of course,
when a mixture of two normals is used as the projection family,
$H_\theta$ becomes

    $H_\theta(x) = p \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}\,\sigma_1}
        e^{-\frac{1}{2}\left(\frac{y-\mu_1}{\sigma_1}\right)^2} dy
      + (1-p) \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}\,\sigma_2}
        e^{-\frac{1}{2}\left(\frac{y-\mu_2}{\sigma_2}\right)^2} dy$

Certain considerations become obvious at this point.
First, we must define what we mean by the "distance"
between two distributions. Several such distance measures
have appeared in the literature. The reader is referred to
the article by Parr and Schucany (1980) for a discussion of
these measures. For our purposes we have chosen the
Cramer-von Mises distance, $W^2$, between distribution
functions $G_1$ and $G_2$ which is given by

    $W^2 = \int_{-\infty}^{\infty} [G_1(x) - G_2(x)]^2 \, dG_2(x)$.

In our setting a computing formula for the Cramer-von
Mises distance between $F_n$ and $H_\theta$ is given by

    $W_n^2 = \frac{1}{12n^2} + \frac{1}{n} \sum_{i=1}^{n}
        \left[ H_\theta(Y_i) - \frac{i - .5}{n} \right]^2$

where $Y_i$ is the ith order statistic. The similarity
between $W_n^2$ and the sum of squared differences between the
empirical distribution function $F_n$ and $H_\theta$ used by Choi
and Bulgren (1968) should be noted.
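
A small sketch of the computing formula may help. It takes the distance to be the integral of the squared difference between the empirical distribution function and the model distribution function, integrated with respect to the model, which works out to 1/(12n^2) plus the average squared difference between the model distribution function at the ith order statistic and (i - .5)/n. The function names are illustrative and this normalization is an assumption; other authors scale the statistic by n.

```python
from statistics import NormalDist

def mixture_cdf(x, p, mu1, sigma1, mu2, sigma2):
    """Distribution function of a two-component normal mixture."""
    return (p * NormalDist(mu1, sigma1).cdf(x)
            + (1.0 - p) * NormalDist(mu2, sigma2).cdf(x))

def cvm_distance(sample, cdf):
    """Cramer-von Mises distance between the empirical df and cdf."""
    ys = sorted(sample)
    n = len(ys)
    total = sum((cdf(y) - (i - 0.5) / n) ** 2
                for i, y in enumerate(ys, start=1))
    return 1.0 / (12.0 * n * n) + total / n

# Usage: distance from a small made-up sample to a trial parameter value.
data = [-0.4, 0.1, 0.3, 2.6, 2.9, 3.2, 3.4, 3.8]
w2 = cvm_distance(data, lambda x: mixture_cdf(x, 0.4, 0.0, 1.0, 3.0, 1.0))
```

The MD estimate is the parameter value that makes this quantity smallest; since the second term is a sum of squares, a least-squares minimizer can be applied to it directly.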

Another consideration involves the minimization
procedure to be employed in minimizing $W_n^2$. Parr and
Schucany used the IMSL quasi-Newton algorithm ZXMIN. Our
comparisons have shown, however, that the IMSL routine
ZXSSQ, which uses Marquardt's (1963) method for minimizing a
sum of squares, was significantly faster, usually taking no
more than half the time required by ZXMIN. In the
simulation studies reported in the next section we have
used the Marquardt minimization procedure when calculating
the MDE. It should be noted that minimization is subject
to the constraints $\sigma_1^2 \ge 0$, $\sigma_2^2 \ge 0$, and
$0 < p < 1$. Another finding which deserves mention before
proceeding is that, similar to the technique we have chosen for
calculating the MLE, the MDE has the desirable property that it is
relatively insensitive to starting values.


3. Starting Values 

In order for the estimators discussed in the previous
section to be used in practice, starting values for the
iterative procedures must be provided. We have chosen to
obtain starting values in this two component univariate
setting using a partitioning technique which is very easy
to implement. In the discussion to follow we will assume,
without loss of generality, that $\mu_1 < \mu_2$. This technique
involves first obtaining the initial estimate of p,
denoted by $p_0$, and then estimating the remaining four
parameters given $p_0$. Under the current implementation,
only the 9 values .1, .2, ..., .9 are allowed as possible
values for $p_0$. For each allowable value of $p_0$, the sample
is divided into two subsamples:

    $Y_1, Y_2, \ldots, Y_{n_1}$   and   $Y_{n_1+1}, Y_{n_1+2}, \ldots, Y_n$

where $Y_i$ is the ith order statistic and $n_1$ is $np_0$ rounded
to the nearest integer. The value for $p_0$ is that value of
$p_0$ for which $p_0(1-p_0)(m_1 - m_2)^2$ is maximized, where $m_j$
is the sample median of the jth subsample. The criterion used
here is a robust counterpart to the classical cluster
analysis procedure of selecting the clusters for which the
within cluster sum-of-squares is minimized. It is easy to
show, however, that the within cluster sum-of-squares is
minimized in the two cluster case when
$p(1-p)(\bar x_1 - \bar x_2)^2$ is maximized, where $\bar x_i$ is
the sample mean of cluster i and $p = n_1/n$ with $n_1$ the number
of sample values placed in cluster 1. Such a clustering is based
upon a cut-point, c, for which all sample values below c are
assigned to the cluster associated with population 1. It must be
observed, however, that due to the overlap between the two
mixture distributions, some sample points assigned to
cluster 1 may be from population 2 and some observations
from population 1 may be in cluster 2. The effect of this
truncation of the right tail in population 1 is that the
sample mean from cluster 1 is likely to underestimate $\mu_1$
while $\mu_2$ is likely to be overestimated. In addition
$\sigma_1^2$ and $\sigma_2^2$ are likely to be underestimated by
$s_1^2$ and $s_2^2$. If we assume that the overlap between the two
populations is not too severe, then the sample values in cluster 1
to the left of $m_1$ are relatively pure observations from
population 1, in which case $m_1$ is a "good" estimate of the
population mean in the case of symmetric distributions.
This reasoning also indicates that $m_1$ and $m_2$ should
provide better estimates of $\mu_1$ and $\mu_2$ than would
$\bar x_1$ and $\bar x_2$. In order to estimate the variances of
the component distributions we again will depend upon the fact that
the values to the left of $m_1$ and to the right of $m_2$ are "pure"
samples from populations 1 and 2 respectively. Thus, we
will use only this portion of the data for estimation of
the sample variances. We have used the fact that the
semi-interquartile range of a standard normal distribution
is .6745, to estimate $\sigma_1^2$ by

    $\hat\sigma_1^2 = \left[ (m_1 - r_1^{(.25)}) / .6745 \right]^2$

where $r_j^{(q)}$ is the qth percentile from the jth cluster,
j = 1, 2. Similarly,
$\hat\sigma_2^2 = \left[ (r_2^{(.75)} - m_2) / .6745 \right]^2$.
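
The partitioning procedure just described can be sketched in a few lines. This is an illustration rather than the authors' code: the function name is hypothetical, the quartiles come from Python's `statistics.quantiles` with its default method (the report does not say which percentile definition was used), and Python's `round` resolves exact halves to the even integer.

```python
from statistics import median, quantiles

def starting_values(sample):
    """Return (p0, mu1, var1, mu2, var2) from the median-based partition."""
    ys = sorted(sample)
    n = len(ys)
    best = None
    for k in range(1, 10):                 # candidate p0 = .1, .2, ..., .9
        p0 = k / 10.0
        n1 = round(n * p0)
        if n1 < 2 or n - n1 < 2:           # need two points for quartiles
            continue
        m1, m2 = median(ys[:n1]), median(ys[n1:])
        crit = p0 * (1.0 - p0) * (m1 - m2) ** 2
        if best is None or crit > best[0]:
            best = (crit, p0, n1, m1, m2)
    _, p0, n1, m1, m2 = best
    r1_25 = quantiles(ys[:n1], n=4)[0]     # 25th percentile of cluster 1
    r2_75 = quantiles(ys[n1:], n=4)[2]     # 75th percentile of cluster 2
    var1 = ((m1 - r1_25) / 0.6745) ** 2
    var2 = ((r2_75 - m2) / 0.6745) ** 2
    return p0, m1, var1, m2, var2

# Usage on an artificial sample made of two clearly separated groups.
xs = [0.01 * i for i in range(50)] + [100.0 + 0.01 * i for i in range(50)]
p0, mu1, var1, mu2, var2 = starting_values(xs)
```

Only the outer half of each cluster enters the variance estimates, which is the point of using the quartile on the side away from the other component.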

In the next section we will discuss the results of a
major simulation investigation comparing ML and MD
estimation. In these simulations the iterative techniques
were initiated by the starting values as discussed in the
previous paragraph. A preliminary simulation investigated
the performance of the starting values described here. In
this preliminary study we compared the convergence
initiated from these starting values with that when the
iterative procedures are started at the true parameter
values. The convergence from these two starts was almost
always to the same parameter estimates, a result which
held for both the MLE and MDE. For this reason and results
to be shown in Section 4, we believe this starting value
procedure to be adequate.

4. Simulation Results 

In the previous two sections we have discussed ML and 
MD estimators for the parameters of the mixture of two 
distributions. In this section we report the results of 
simulations designed to compare these two estimators when 
the component distributions are normal and when they are 
non-normal. In addition we have made our comparisons under 
varying degrees of separation between the two 
distributions. All computations were performed on the CDC 
6600 at Southern Methodist University. 

In our comparison of the MDE and MLE we have begun by
comparing their performance when the normality assumption
is valid, i.e., when the component distributions actually
are normal. We should mention that because of the
optimality properties of the MLE we would expect that the
MLE would be superior in this situation. Since in practice
the validity of the normality assumption is subject to
question, we are also very interested in the performance
of the MDE and MLE when the component distributions are
not normal. To this end we have simulated mixtures in
which the component distributions are distributed as a
Student's t with 4 degrees of freedom. We simulated 500
samples of size n=100 from mixtures of normal and of t(4)
components for each of the following parameter
configurations:

Mixing proportion 
.25 
.50 
.75 


Variances 


2 

= 2 a 


2 

2 


The nature of the mixture model also depends on the
amount of separation between the two component
distributions. While, for sufficient separation, the
mixture model has a characteristic bimodal shape,
Behboodian (1970) has shown, for example, that a sufficient
condition for the mixture density (of two normal
components) to be unimodal is that
$|\mu_1 - \mu_2| < 2\min(\sigma_1, \sigma_2)$. Of
course, in this situation, parameter estimation is
difficult.

For purposes of quantifying this separation between
the components, we will define a measure of "overlap"
between two distributions. Without loss of generality we
assume that population 1 is centered to the left of
population 2. We define "overlap" to be the probability of
misclassification using the rule:

    Classify an observation x as:
        population 1 if $x < x_c$
        population 2 if $x > x_c$,

where $x_c$ is the unique point between $\mu_1$ and $\mu_2$ such that

    $p f_1(x_c) = (1-p) f_2(x_c)$.
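
For normal components this overlap can be evaluated numerically. The following sketch finds the cut-point by bisection between the two means and then adds the two misclassification probabilities; the function name and the bisection tolerance are choices made here, and the code assumes, as the definition does, that a single crossing point lies between the means.

```python
from statistics import NormalDist

def overlap(p, mu1, sigma1, mu2, sigma2, tol=1e-12):
    """Return (x_c, overlap) for p*N(mu1, sigma1) and (1-p)*N(mu2, sigma2)."""
    f1, f2 = NormalDist(mu1, sigma1), NormalDist(mu2, sigma2)

    def g(x):
        # Difference of the scaled component densities; zero at the cut-point.
        return p * f1.pdf(x) - (1.0 - p) * f2.pdf(x)

    lo, hi = mu1, mu2
    if g(lo) <= 0.0 or g(hi) >= 0.0:
        raise ValueError("no crossing point between the two means")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    xc = (lo + hi) / 2.0
    # Population 1 points above x_c plus population 2 points below x_c.
    return xc, p * (1.0 - f1.cdf(xc)) + (1.0 - p) * f2.cdf(xc)

# Usage: equal unit variances and p = .5; the means are placed so that
# the overlap is .10.
sep = 2.0 * NormalDist().inv_cdf(0.90)
xc, ov = overlap(0.5, 0.0, 1.0, sep, 1.0)
```

With equal variances and p = .5 the cut-point is the midpoint of the means, so the overlap reduces to the standard normal tail area beyond half the separation.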

We have based our current study on "overlaps" of .03 and
.10. In Figure 1 we display the mixture densities associated
with normal components and $\sigma_1^2 = \sigma_2^2$. For each
mixture, the scaled components $p f_1(x)$ and $(1-p) f_2(x)$ are
also shown. Note that the densities for p=.75 are not displayed
here since when $\sigma_1^2 = \sigma_2^2$, it follows that
$f_p(x) = f_{1-p}(\mu_1 + \mu_2 - x)$ where $f_h(x)$
denotes the mixture density associated with a mixing
proportion of h. Thus the shapes of the densities at p=.75
can be inferred from those at p=.25. Likewise, parameter
estimation for p=.75 is not included in the results of the
simulations when $\sigma_1^2 = \sigma_2^2$.

FIGURE 1 - Mixture Densities Associated with Normal Components
and $\sigma_1^2 = \sigma_2^2 = 1$ (panels: overlap = .10 and
overlap = .03)

Although both estimation procedures provide estimates of
all 5 of the parameters, only the results for the estimation
of p will be shown since the mixing proportion is the
parameter of primary interest. In addition, when dealing
with the non-normal mixtures, the remaining parameter
estimates often do not have a meaningful interpretation. In
these simulations we have used the procedure discussed in
the previous section to obtain starting values. It should be
noted that although we refer to mixtures of t(4)
distributions here, they are actually mixtures of
distributions associated with the random variable
$T' = aT + b$, where T has a t(4) distribution. These
modifications are made in order to obtain the desired separation
and variance ratios.
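
The rescaling $T' = aT + b$ can be sketched as below. A t(4) variable has variance 2, so choosing a equal to the desired standard deviation divided by the square root of 2 gives the desired variance, and b sets the center. The function name and the particular values of a and b in the usage lines are hypothetical.

```python
import math
import random

def scaled_t4(a, b, rng):
    """Draw a*T + b where T has a Student's t distribution with 4 df."""
    z = rng.gauss(0.0, 1.0)
    v = rng.gammavariate(2.0, 2.0)   # chi-square with 4 degrees of freedom
    return a * z / math.sqrt(v / 4.0) + b

# Usage: a component centered at 5 with variance 2 * a**2 = 8.
rng = random.Random(3)
draws = sorted(scaled_t4(2.0, 5.0, rng) for _ in range(20000))
center = draws[len(draws) // 2]      # sample median, close to b
```

A mixture observation is then obtained by drawing from the first component with probability p and from the second otherwise.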

In Table 1 we show the results of the simulation
comparing the performance of the MLE and MDE. In particular,
let $\hat p_i$ denote the estimate of p for the ith sample. Then
based upon the simulations, estimates of the bias and MSE
are given by:

    $\widehat{\mathrm{bias}} = \frac{1}{n_s} \sum_{i=1}^{n_s} (\hat p_i - p)$

    $\widehat{\mathrm{MSE}} = \frac{1}{n_s} \sum_{i=1}^{n_s} (\hat p_i - p)^2$,

where $n_s$ is the number of samples. It should be noted that
nMSE is the quantity actually given in the table. In
addition, we provide the ratio

    $E = \widehat{\mathrm{MSE}}(\mathrm{MLE}) \,/\, \widehat{\mathrm{MSE}}(\mathrm{MDE})$

as an efficiency measure.
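
These summary quantities are simple to compute from a list of simulated estimates. The sketch below uses illustrative function names; the two nMSE values in the last line are the normal, overlap .10, p = .25 entries of Table 1.

```python
def bias_and_nmse(estimates, true_p, sample_size):
    """Estimated bias and n times the estimated MSE of a set of estimates."""
    ns = len(estimates)
    bias = sum(e - true_p for e in estimates) / ns
    mse = sum((e - true_p) ** 2 for e in estimates) / ns
    return bias, sample_size * mse

def efficiency(nmse_mle, nmse_mde):
    """Ratio MSE(MLE) / MSE(MDE); values below 1 favor the MLE."""
    return nmse_mle / nmse_mde

bias, nmse = bias_and_nmse([0.4, 0.6], 0.5, 100)
ratio = efficiency(2.25, 5.30)
```

Since both MSEs are multiplied by the same n, the ratio is the same whether it is formed from MSE or from nMSE.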

Upon viewing the results, it can be seen, as expected,
that the bias and MSE associated with the MLE were generally
smaller than those for the MDE when the components were
normally distributed. This relationship between the
estimators held for both overlaps. The MLE and MDE were
quite similar at p=.5 while for p=.25 and p=.75 the
superiority of the MLE is more pronounced.

                                TABLE 1
                Simulation Results Comparing MLE and MDE
                           Sample Size = 100
                     Number of replications = 500

                    Overlap = .10                        Overlap = .03
               Bias          nMSE*                  Bias          nMSE*
   p        MLE    MDE    MLE    MDE     E       MLE    MDE    MLE    MDE     E

NORMAL
                                                 .008   .02
                                                 .000   .00
  .25      .002   .084   2.25   5.30    .42      .006   .027    .49    .96    .51
  .50     -.009   .005   2.41   2.79    .86      .009   .008    .42    .44    .95
  .75     -.086  -.137   4.87   8.36    .58     -.002  -.024    .47   1.08    .44

t(4)
                           35   6.18   1.19                            .44   2.00
                           59   1.82   3.07                            .27   1.74
            061   .098   4.63   5.20    .89      .044   .029    .95    .61   1.56
            028   .022   4.49   1.80   2.49      .010   .001    .55    .30   1.83
            076  -.058   7.84   3.68   2.13     -.012  -.016    .57    .36   1.58

*n times the MSE where n = sample size

For the t(4) mixtures the relationship between MDE and
MLE is reversed in that the MDE generally has the smaller
bias and MSE. The superiority of the MDE in this case is due
in part to the heavy tails in the t(4) mixture. The MLE
often interpreted an extreme observation as being the only
sample value from one of the populations with all remaining
observations belonging to the other. Due to the well known
singularities associated with a zero variance estimate for a
component distribution (Day (1969)), we were concerned that
the observed behavior of the MLE was due to the fact that we
did not constrain the variances away from zero.
However, simulation results in which equal variances were
assumed (which removes the singularity) and also those which
used a penalized MLE suggested by Redner (1980) were very
similar to those quoted here.

Although the MSE is a widely used measure among
statisticians for assessing the performance of an estimator,
the practical implications, for example, of an estimator
having an MSE three times larger than that for another
estimator, may not be immediately apparent. Recall that each
MSE quoted in Table 1 is based upon 500 estimates of p. In
order to provide a better appreciation for the practical
impact of differences in MSE, in Figure 2 we display
histograms of the 500 estimates of p associated with three
different MSEs in the table. The true value of p in each
case is p = .5. It is obvious that as the MSE increases, the
performance of the estimator deteriorates. Notice that the
MSE for Figure 2(c) is approximately three times greater
than the MSE associated with Figure 2(a), while the MSE for
Figure 2(b) is approximately twice that for Figure 2(a).
Thus, from these histograms, an intuitive feel for
efficiency ratios of E=2 and E=3 can be obtained.

Figure 2. Histograms of Estimates of p Based
          Upon 500 Samples of Size 100 from
          Mixtures in which p = .5

A very surprising result is that the starting values
obtained using the procedure outlined in Section 3 produced
estimators which were competitive with both the MLE and MDE.
In fact, for both the normal and t(4) mixtures, the MSEs
associated with the starting values were lower than those
for the MDE and MLE for every parameter configuration
associated with an overlap of .10. At an overlap of .03,
however, the starting value estimates were generally poorer
than those for the MDE and MLE.

5. Mixtures of Asymmetric Distributions 

The simulation results of the previous section focus on 
the performance of the MLE and MDE under deviations from the 
assumption of normality. However, the t(4) distribution is 
symmetric, and recent studies have indicated that there is 
often a substantial asymmetry in the component distributions 
for variables of interest in agricultural remote sensing. A 
Monte Carlo examination of the performance of the MDE and 
MLE, assuming normal components, when in fact the component 
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distributions were asymmetric, was performed, and the 
results of this examination will be discussed in this 
section. 

For purposes of our examination, we simulated mixtures
of $\chi^2(9)$ distributions with p=.5. In these simulations the
two distributions differed from each other only by a
location shift. Actually the component distribution to the
left is $\chi^2(9)$ while that to the right is that of a "shifted"
$\chi^2(9)$ with origin no longer at 0. This shift was varied to
provide overlaps of .01, .05, and .10. Since our estimation
procedures involve a normality assumption, we used the means
and variances of the two component $\chi^2(9)$ distributions and
the true mixing proportions as our starting values. The
problem of obtaining starting values from the data in this
case is being examined. In Table 2 we display the results of
this simulation. Only when the two component distributions
were widely separated (overlap=.01) do the two procedures
provide reasonable results. However, when the two chi-square
distributions are not widely separated, both estimators tend
to seriously underestimate p. In Figure 3 we display the
three mixture distributions on which these simulations were
based. We see there that it is no surprise that the estimate
of p is less than .5, especially for an overlap of .10. Both
estimation procedures view this as a mixture of normals, and
therefore make the reasonable interpretation that the density to
the left has a smaller variance and a mixing proportion less
than .5. These results point out the impact which skewed
distributions can have on the proportion estimation in the
mixture model when normal mixtures are assumed.

                               TABLE 2
                          Simulation Results
                 Mixtures of $\chi^2(9)$ Components
                          Sample Size = 100
                    Number of replications = 200
                               p = .5

                        MLE                         MDE
  Overlap    $\hat p$   Bias   nMSE      $\hat p$   Bias   nMSE

    .10        .28      -.22    6.8        .28      -.22    6.6
    .05        .35      -.15    2.7        .37      -.13    2.3
    .01        .47      -.03     .4        .45      -.05     .5

Current investigation into this area centers around 
modifying the estimation procedures by assuming that the 
underlying component distributions belong to some family of 
distributions whose members can be either symmetric or 
asymmetric depending on parameter configurations. At the 
present time, the Weibull distribution is being examined 
concerning its usefulness. 

6. Concluding Results 

We believe that the results of the preceding sections 
are of sufficient substance to motivate further research in 
the area of MD estimation in the mixture model. Our results 
indicate that the MDE is indeed more robust than the MLE in 
the sense that it is less sensitive to symmetric departures 
from the underlying assumption of normality of component 
distributions. Several areas for future investigation have 
already been identified in addition to the asymmetric 
components problem discussed in Section 5. 

First, simulations similar to the ones presented here 
should be performed without the assumption of only two 
populations in the mixture. The performance of the MDE and 
MLE should be compared when the number of populations is 
known and larger than two. In addition the applicability of 
the MDE to the problem of estimating the number of 
populations also warrants investigation. We plan to examine 
these possibilities.

Second, the problem of applying the MDE to the multivariate
setting is of interest. Preliminary indications are that 
such an extension will be possible. 

Third, the choice of distance measure in the MDE is a 
topic of interest. Our results are not meant to imply that 
$W^2$ is optimal.

Finally, the MDE and MLE must ultimately be compared on 
real data. Several related practical considerations have not 
yet been investigated. For example, when applying these 
estimators to LANDSAT data, the number of iterations allowed 
must be small due to time constraints. In the simulations 
described here, these constraints were not imposed and 
iteration was allowed to continue until convergence was 
obtained. The performance of the MDE and MLE under 
convergence restrictions should be examined.

1. Introduction

A standard approach to the estimation of crop
proportions in agricultural remote sensing has been to
estimate the proportions $p_1, p_2, \ldots, p_m$ in the mixture density

$$f(x) = p_1 f_1(x) + p_2 f_2(x) + \cdots + p_m f_m(x) \qquad (1.1)$$

where m is the number of components (crops) in the mixture
and $f_i(x)$ is the density associated with component i. The
usual procedure for estimating the parameters in the mixture
model of (1.1) has been to:

(a) assume that the component distributions are normal 

(b) use maximum likelihood estimation. 

The variable X has usually been taken to be the
reflected energy in the four LANDSAT bands or some linear
combination of these such as greenness or brightness. Recent
efforts have focused on the use of certain derived features
from growth models such as $g_{\max}$ and $t_{\max}$ as variables in the
mixture model. Studies have indicated that there is often a
substantial asymmetry in the distributions of these features
for a given crop. Woodward et al. (1982) have shown that
asymmetry in the component distributions can cause a
substantial bias in the proportion estimators when the
mixture of normals model is assumed. As an example, in
Figure 1 we display the mixture density associated with the
mixture of two distributions. Examination of the figure
reveals that if we assume that the component distributions
are symmetric, then we must conclude that $p_1 < p_2$ and that the
component to the right has larger variance. Actually, in
this mixture $p_1 = p_2$ and the distribution to the left in this
mixture is a $\chi^2(9)$ while the component to the right is a
"shifted" $\chi^2(9)$, i.e. its left truncation point is at x=10
instead of x=0. We see that a bias will be introduced in
estimating mixing proportions in this mixture if the
component distributions are assumed to be symmetric, which
of course is the case when the components are assumed to be
normal.

In this paper we will discuss techniques for estimating 
the crop proportions in the presence of asymmetric component 
distributions. In particular the estimation procedures we 
will propose assume that the underlying component 
distributions belong to some family of distributions whose 
members can be either symmetric or skewed depending on 
parameter configurations. At the present time, the Weibull 
distribution is being examined concerning its usefulness in 
this area. The effectiveness of this technique will be
examined through simulations.

FIGURE 1

A Mixture Density

2. The Weibull Distribution

The Weibull distribution is named after the Swedish
physicist Waloddi Weibull who used it to represent the
distribution of the breaking strength of materials
(Weibull (1939)). The distribution has been widely used in
recent years in the fields of reliability and quality
control. Its popularity is largely due to the flexibility
which it introduces into the model due to the fact that it
can be used to describe distributions which are symmetric or
skewed in either direction. For these reasons we have chosen
to investigate its applicability to estimation in mixtures
of asymmetric components. The three-parameter Weibull
density can be expressed as

$$f(x) = \frac{\gamma}{\beta} \left(\frac{x-\alpha}{\beta}\right)^{\gamma-1} \exp\left\{-\left(\frac{x-\alpha}{\beta}\right)^{\gamma}\right\}, \quad x \ge \alpha, \quad \beta, \gamma > 0 \qquad (2.1)$$

We will use the notation $X \sim W(a,b,c)$ to indicate that the
random variable X has a three-parameter Weibull distribution
with parameters $\alpha = a$, $\beta = b$, and $\gamma = c$. The parameter $\alpha$ locates
the left truncation point and $\beta$ serves as a scale parameter
while $\gamma$ determines the shape of the distribution. In
Figure 2 we show Weibull densities for a fixed $\alpha$ and $\beta$ and a
range of values for $\gamma$. From the figure it is clear that the
shape can vary dramatically as $\gamma$ changes. In Figure 3 the
fact that the Weibull density can be skewed to the left as
well as to the right is more clearly demonstrated. For
$\gamma = 3.60232$ approximately, the standardized skewness parameter
$\mu_3 / \mu_2^{3/2}$, where $\mu_i$ is the ith central moment, is zero,
indicating symmetry. If $\gamma < 3.60232$ then the Weibull is skewed
to the right, while if $\gamma > 3.60232$ it is skewed to the left.
The Weibull distribution is unimodal, and if $\gamma > 1$ the mode
occurs at

$$x_m = \alpha + \beta \left(\frac{\gamma-1}{\gamma}\right)^{1/\gamma} \qquad (2.2)$$

Otherwise, when $0 < \gamma \le 1$, the mode occurs at $x_m = \alpha$.

FIGURE 2

Weibull Densities with $\alpha = 0$ and fixed $\beta$
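The symmetry point and the mode formula quoted above can be checked numerically. The following Python sketch is only an illustration and was not part of the original study; the function names are hypothetical. It assumes the usual gamma-function expressions for the Weibull central moments (the skewness does not depend on the location or scale parameters) and locates the shape value at which the skewness changes sign by simple bisection.

```python
import math


def weibull_skewness(shape):
    """Standardized skewness mu_3 / mu_2**1.5 of a Weibull with this shape."""
    g1 = math.gamma(1.0 + 1.0 / shape)
    g2 = math.gamma(1.0 + 2.0 / shape)
    g3 = math.gamma(1.0 + 3.0 / shape)
    variance = g2 - g1 ** 2                      # second central moment (scale 1)
    third_central = g3 - 3.0 * g1 * g2 + 2.0 * g1 ** 3
    return third_central / variance ** 1.5


def symmetric_shape(lo=3.0, hi=4.0, tol=1e-10):
    """Bisection for the shape at which the skewness passes through zero."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if weibull_skewness(mid) > 0.0:          # still skewed to the right
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def weibull_mode(alpha, beta, gamma):
    """Mode of the three-parameter Weibull: (2.2) if gamma > 1, else alpha."""
    if gamma > 1.0:
        return alpha + beta * ((gamma - 1.0) / gamma) ** (1.0 / gamma)
    return alpha


if __name__ == "__main__":
    print(symmetric_shape())
    print(weibull_mode(7.5, 4.0, 3.6))
```

The bracket [3, 4] is an assumption chosen to surround the value of approximately 3.6 given in the text; the skewness is positive at its left end and negative at its right end.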

Dubey (1967) has studied the Weibull distribution when
$\gamma = 3.60232$ and has concluded that it is very similar to the
normal. In particular, Dubey has shown that

$$\sup_{-3 < v < 3} \left| F_Z(v) - F_Y(v) \right| \approx .0078 \qquad (2.3)$$

where $F_Z$ denotes the cumulative distribution function of the
random variable $Z \sim N(0,1)$ and Y is the standardized variate
$Y = (X-\mu)/\sigma$, where $\mu$ and $\sigma^2$ are the mean and variance of
the Weibull variate X.

It should be noted that the Weibull distribution is 
often given in the literature in two-parameter form in which
$\alpha$ is assumed to be known (and usually 0). However, unless
otherwise specified, when we refer to the Weibull 
distribution, we will be referring to the three-parameter 
form specified by (2.1). 

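The cumulative distribution function corresponding to
the three-parameter Weibull is given by the closed form
expression

$$F_X(x) = 1 - \exp\left\{-\left(\frac{x-\alpha}{\beta}\right)^{\gamma}\right\} \qquad (2.4)$$

while the noncentral moments are given by

$$\mu_r' = \sum_{k=0}^{r} \binom{r}{k} \alpha^{r-k} \beta^{k} \, \Gamma\left(\frac{k}{\gamma} + 1\right) \qquad (2.5)$$

From (2.5) it can be seen that

$$\mu = \alpha + \beta \, \Gamma\left(\frac{1}{\gamma} + 1\right)$$

$$\sigma^2 = \beta^2 \left\{ \Gamma\left(\frac{2}{\gamma} + 1\right) - \Gamma^2\left(\frac{1}{\gamma} + 1\right) \right\}. \qquad (2.6)$$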
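As a check on the step from (2.5) to (2.6), the short Python sketch below evaluates the noncentral moment sum directly and compares it with the closed forms for the mean and variance. It is an illustration only; the function names and the parameter values in the example are assumptions, not values taken from the report.

```python
import math


def noncentral_moment(r, alpha, beta, gamma):
    """r-th moment about zero of W(alpha, beta, gamma), following (2.5)."""
    return sum(
        math.comb(r, k) * alpha ** (r - k) * beta ** k * math.gamma(k / gamma + 1.0)
        for k in range(r + 1)
    )


def weibull_mean_var(alpha, beta, gamma):
    """Mean and variance of W(alpha, beta, gamma), following (2.6)."""
    g1 = math.gamma(1.0 / gamma + 1.0)
    g2 = math.gamma(2.0 / gamma + 1.0)
    return alpha + beta * g1, beta ** 2 * (g2 - g1 ** 2)


if __name__ == "__main__":
    a, b, c = 7.5, 4.0, 3.6          # hypothetical parameter values
    mean, var = weibull_mean_var(a, b, c)
    m1 = noncentral_moment(1, a, b, c)
    m2 = noncentral_moment(2, a, b, c)
    print(mean, m1)                  # the two should agree
    print(var, m2 - m1 ** 2)         # variance = second moment minus mean squared
```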


The first three moments of the Weibull distribution
determine the values of $\alpha$, $\beta$, and $\gamma$. The method of moment
estimators can be obtained using these relationships, but
unfortunately the estimators do not exist in a closed form.
The log-likelihood function for a random sample of n
observations from the Weibull distribution is

$$\ln(L) = n \ln\gamma - n\gamma \ln\beta + (\gamma-1) \sum_{i=1}^{n} \ln(X_i - \alpha) - \frac{1}{\beta^{\gamma}} \sum_{i=1}^{n} (X_i - \alpha)^{\gamma} \qquad (2.7)$$

Differentiating ln(L) yields the following likelihood
equations

$$-(\gamma-1) \sum_{i=1}^{n} (X_i - \alpha)^{-1} + \frac{\gamma}{\beta^{\gamma}} \sum_{i=1}^{n} (X_i - \alpha)^{\gamma-1} = 0 \qquad (2.8)$$

$$\beta = \left\{ \frac{1}{n} \sum_{i=1}^{n} (X_i - \alpha)^{\gamma} \right\}^{1/\gamma} \qquad (2.9)$$

$$\gamma = \left\{ \frac{1}{n} \sum_{i=1}^{n} \left[ \ln\left(\frac{X_i - \alpha}{\beta}\right) \right] \left[ \left(\frac{X_i - \alpha}{\beta}\right)^{\gamma} - 1 \right] \right\}^{-1} \qquad (2.10)$$

Let $\hat{\alpha}$, $\hat{\beta}$, and $\hat{\gamma}$ denote the estimators obtained from the
simultaneous solution of equations (2.8) to (2.10). If $0 < \hat{\alpha} \le Y_1$,
where $Y_i$ denotes the ith order statistic, these estimators
are the maximum likelihood (ML) estimators for the three
Weibull parameters. However, due to the restriction $x \ge \alpha$ in
(2.1), if $\hat{\alpha} > Y_1$, then the MLE of $\alpha$ is taken to be $Y_1$ and
$\beta$ and $\gamma$ are estimated from (2.9) and (2.10). As in the case
of method of moment estimators, the ML estimators do not
have a closed form expression. For a general review of the
literature on Weibull parameter estimation see Johnson and
Kotz (1970).


3. Mixtures of Weibull Distributions 


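In order to examine the feasibility of using the
Weibull as a model for the component distributions in the
mixture model of (1.1), we will investigate the estimation
of the parameters in the mixture of two Weibull
distributions. This mixture density is given in (3.1)

$$f(x) = p \, \frac{\gamma_1}{\beta_1} \left(\frac{x-\alpha_1}{\beta_1}\right)^{\gamma_1-1} \exp\left\{-\left(\frac{x-\alpha_1}{\beta_1}\right)^{\gamma_1}\right\} + (1-p) \, \frac{\gamma_2}{\beta_2} \left(\frac{x-\alpha_2}{\beta_2}\right)^{\gamma_2-1} \exp\left\{-\left(\frac{x-\alpha_2}{\beta_2}\right)^{\gamma_2}\right\} \qquad (3.1)$$

where the 7 parameters $p$, $\alpha_1$, $\beta_1$, $\gamma_1$, $\alpha_2$, $\beta_2$, and $\gamma_2$ are
assumed to be unknown.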

Previous research in this area includes that of
Kao (1959), who proposed a graphical procedure for estimating
the parameters in (3.1) when one of the location parameters
is assumed to be known and equal to zero. The estimation of
the 6 remaining parameters is accomplished using a graphical
procedure whose applicability to our problem seems to be
limited, although some of his estimation rules could be
automated. Rider (1961) and Falls (1970) propose estimating
the parameters of a mixture of two-parameter Weibulls using
the method of moments. Falls' procedure involves estimating
the mixing proportion p using a graphical procedure similar
to that of Kao.

Maximum likelihood estimation of the parameters of
(3.1) has been discussed by Looney and Bargmann (1982).
Differentiating the log-likelihood function

$$\ln(L) = \sum_{i=1}^{n} \ln\left[ p f_1(X_i) + (1-p) f_2(X_i) \right]$$

with respect to each of the 7 parameters yields the
likelihood equations

$$(\gamma_j - 1) \sum_{i=1}^{n} f(j|X_i) (X_i - \alpha_j)^{-1} - \frac{\gamma_j}{\beta_j^{\gamma_j}} \sum_{i=1}^{n} f(j|X_i) (X_i - \alpha_j)^{\gamma_j - 1} = 0, \quad j = 1,2 \qquad (3.2)$$

$$\beta_j = \left\{ \left[ \sum_{i=1}^{n} (X_i - \alpha_j)^{\gamma_j} f(j|X_i) \right] \Big/ \sum_{i=1}^{n} f(j|X_i) \right\}^{1/\gamma_j}, \quad j = 1,2 \qquad (3.3)$$

$$\gamma_j = \left\{ \left[ \sum_{i=1}^{n} \left( \left(\frac{X_i - \alpha_j}{\beta_j}\right)^{\gamma_j} - 1 \right) \ln\left(\frac{X_i - \alpha_j}{\beta_j}\right) f(j|X_i) \right] \Big/ \sum_{i=1}^{n} f(j|X_i) \right\}^{-1}, \quad j = 1,2 \qquad (3.4)$$

$$p = \frac{1}{n} \sum_{i=1}^{n} f(1|X_i) \qquad (3.5)$$

where $f(i|x) = p_i f_i(x)/f(x)$, with $p_1 = p$, $p_2 = 1-p$, $f_i(x)$
denoting the ith component density, and f(x) the mixture
density. Solving this
set of equations for the maximum likelihood estimators is
difficult due largely to equations (3.2), which are not in
fixed point form. Looney and Bargmann (1982) suggested a
procedure in which the shape parameters $\gamma_1$ and $\gamma_2$ are fixed
independently at each of the values in the set

{1/5, 1/4, ..., 4, 5}

and, for each of the $(\hat{\gamma}_1, \hat{\gamma}_2)$ pairs, "preliminary" maximum
likelihood estimates of the remaining 5 parameters are
found. A search procedure results in selecting the $(\hat{\gamma}_1, \hat{\gamma}_2)$
pair for which ln(L) is maximized. With $\hat{\gamma}_1$ and $\hat{\gamma}_2$ fixed at
these values, maximum likelihood estimation for the
remaining 5 parameters is then carried through to
convergence. The Looney and Bargmann procedure for solving
the system of equations (3.2) - (3.5) seems overly
restrictive with respect to the selection of possible values
of the shape parameter, while expansion of the search
procedure to allow for more shape parameter values would
probably be prohibitive because of time constraints.
However, solution of these likelihood equations directly
appears to us to be quite intractable. For these reasons, we
have investigated the use of minimum distance (MD)
estimation, first introduced by Wolfowitz (1957), for
estimating the 7 parameters in the mixture of Weibulls model
given in (3.1). Woodward et al. (1982) have recently studied
the use of MD estimation in the mixture of normals model.
These authors showed that MD estimation was easy to
implement in that setting, and that MD estimators proved to
be superior to ML estimators under departures from component
normality. Since our use of Weibull components is due to the
flexibility which it introduces into the model rather than
underlying theoretical justifications, we definitely need an
estimation procedure which is robust to departures from
assumptions.
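To make the role of the weights f(j|x) in the likelihood equations concrete, the following Python sketch computes these weights and applies one pass of the fixed-point updates (3.3) and (3.5), holding the location and shape parameters fixed. It is a minimal illustration under stated assumptions, not the Looney and Bargmann procedure and not the method used in this study; the function names, data, and parameter values are hypothetical, and every observation is assumed to lie above the smaller location parameter.

```python
import math


def weibull_pdf(x, alpha, beta, gamma):
    """Three-parameter Weibull density (2.1); zero at or below alpha."""
    if x <= alpha:
        return 0.0
    z = (x - alpha) / beta
    return (gamma / beta) * z ** (gamma - 1.0) * math.exp(-(z ** gamma))


def posterior_weights(xs, p, comp1, comp2):
    """f(1|x_i) for each observation; f(2|x_i) is one minus this value."""
    weights = []
    for x in xs:
        a = p * weibull_pdf(x, *comp1)
        b = (1.0 - p) * weibull_pdf(x, *comp2)
        weights.append(a / (a + b))
    return weights


def update_p_and_betas(xs, p, comp1, comp2):
    """One pass of (3.5) for p and (3.3) for beta_1, beta_2."""
    w1 = posterior_weights(xs, p, comp1, comp2)
    w2 = [1.0 - w for w in w1]
    new_p = sum(w1) / len(xs)                                  # equation (3.5)
    new_betas = []
    for w, (alpha, _beta, gamma) in ((w1, comp1), (w2, comp2)):
        # max(..., 0) guards observations below alpha, whose weight is zero
        num = sum(wi * max(x - alpha, 0.0) ** gamma for wi, x in zip(w, xs))
        new_betas.append((num / sum(w)) ** (1.0 / gamma))      # equation (3.3)
    return new_p, new_betas


if __name__ == "__main__":
    data = [8.0, 9.0, 10.5, 12.0, 20.0, 22.0, 25.0, 27.0]      # made-up values
    print(update_p_and_betas(data, 0.5, (7.5, 4.0, 2.0), (15.0, 8.0, 2.5)))
```

Equations (3.2) and (3.4) are deliberately omitted here, since, as noted above, (3.2) is not in fixed point form.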

The minimum distance estimator of the parameter $\theta$
(possibly vector valued) is defined to be that value of $\theta$
which minimizes the distance between $H_\theta$ and $F_n$, where
$\mathcal{H} = \{H_\theta : \theta \in \Omega\}$ denotes a family of distributions depending on $\theta$
and $F_n$ denotes the empirical distribution function, i.e.
$F_n(x) = k/n$ where k is the number of observations less than or
equal to x. The family of distributions $\mathcal{H}$ is referred to as
the projection model, where in this case
$\theta = (p, \alpha_1, \beta_1, \gamma_1, \alpha_2, \beta_2, \gamma_2)'$ and $H_\theta(x)$ is the distribution
function associated with a mixture of two Weibull components
given by

$$H_\theta(x) = p \left[ 1 - \exp\left\{-\left(\frac{x-\alpha_1}{\beta_1}\right)^{\gamma_1}\right\} \right] + (1-p) \left[ 1 - \exp\left\{-\left(\frac{x-\alpha_2}{\beta_2}\right)^{\gamma_2}\right\} \right]. \qquad (3.6)$$


Note that in contrast to the situation in which the
projection model is taken to be the mixture of two normals,
$H_\theta(x)$ in (3.6) has a closed form expression. The choice of
distance function to be used to measure the distance between
two distributions is a topic of current interest in the
field of MD estimation. Woodward et al. (1982) used the
Cramer-von Mises distance, $W^2$, given by

$$W^2 = \int_{-\infty}^{\infty} \left[ G_1(x) - G_2(x) \right]^2 dG_2(x) \qquad (3.7)$$

where $G_1$ and $G_2$ are two distribution functions, and we have
chosen to use this distance measure in the current study.
The distance between a distribution function $H_\theta$ and the
empirical distribution function $F_n$, which is needed for
calculation of the MD estimator, is given by the simplified
expression

$$W_n^2 = \frac{1}{12n} + \sum_{i=1}^{n} \left[ H_\theta(Y_i) - \frac{i - .5}{n} \right]^2 \qquad (3.8)$$

where $Y_i$ denotes the ith order statistic. Since $H_\theta(x)$ exists in
closed form, the MD estimator in this case is easily
obtained by using nonlinear least squares techniques to
minimize (3.8). We have chosen to perform this minimization
by using Marquardt's (1963) procedure.
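The objective function in (3.8) is simple to evaluate because (3.6) is available in closed form. The Python sketch below is a minimal illustration of that evaluation only; it does not implement Marquardt's procedure, and the function names and parameter values are hypothetical. The helper `weibull_quantile` is an added convenience used to build a test sample whose fitted probabilities fall exactly on the plotting positions (i - .5)/n, so that the sum in (3.8) vanishes and only the 1/(12n) term remains.

```python
import math


def mixture_cdf(x, theta):
    """H_theta(x) of (3.6); theta = (p, a1, b1, g1, a2, b2, g2)."""
    p, a1, b1, g1, a2, b2, g2 = theta

    def comp(a, b, g):
        return 0.0 if x <= a else 1.0 - math.exp(-(((x - a) / b) ** g))

    return p * comp(a1, b1, g1) + (1.0 - p) * comp(a2, b2, g2)


def cvm_distance(sample, theta):
    """Cramer-von Mises distance (3.8) between H_theta and the sample EDF."""
    ys = sorted(sample)                      # order statistics Y_1 <= ... <= Y_n
    n = len(ys)
    total = 1.0 / (12.0 * n)
    for i, y in enumerate(ys, start=1):
        total += (mixture_cdf(y, theta) - (i - 0.5) / n) ** 2
    return total


def weibull_quantile(u, alpha, beta, gamma):
    """Inverse of the Weibull distribution function (2.4)."""
    return alpha + beta * (-math.log1p(-u)) ** (1.0 / gamma)


if __name__ == "__main__":
    n = 50
    theta = (1.0, 7.5, 4.0, 3.6, 20.0, 4.0, 3.6)   # p = 1: a single component
    sample = [weibull_quantile((i - 0.5) / n, 7.5, 4.0, 3.6) for i in range(1, n + 1)]
    print(cvm_distance(sample, theta), 1.0 / (12 * n))
```

In practice the value returned by `cvm_distance` would be passed to a general nonlinear least squares or optimization routine, with the constraints on the parameters imposed by that routine.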

4. Simulation Results 

In Section 3 we discussed the problem of estimation in
the mixture of Weibulls model. From that discussion it
appears that the minimum distance techniques are preferable
for estimating the parameters in a mixture of three-parameter
Weibulls, especially in terms of computational
convenience. In this section we will discuss the results of
an initial computer simulation which was designed for use in
evaluating the numerical capabilities of this method. All
computations were performed on the CDC 6600 at Southern
Methodist University. In this section we will evaluate the
performance of the MD estimation procedures discussed. Since
the usual procedure is to assume that the components are
normal, we will compare the Weibull based MDEs with the
normal based procedures. We have generated samples from
mixtures of normal components and mixtures of $\chi^2(9)$
components. Obviously, we would expect the normal based
procedures to perform better than Weibull based procedures
when the mixture really is a mixture of normal components.
However, if the Weibull techniques are to be useful, then
they must give reasonable results in this situation since
the normal assumption does appear to be a reasonable
assumption in some cases. Since the Weibull with $\gamma = 3.6$ is
very nearly normal, there is reason to believe that Weibull
procedures will perform well in this situation. We have not
simulated samples from mixtures of Weibull distributions,
but we plan to consider this in the future. Of course, as
mentioned in the previous section, we are most interested in
the performance of the Weibull based procedures when the
underlying components from which we sample are not
necessarily Weibulls, but are realistic representatives of
the types of component distributions we see in practice.

Our simulation results are based on 200 samples of size
n=200 from mixtures of normal and of $\chi^2(9)$ components. In
each mixture, the variances associated with the two
components are equal. In fact, the two component
distributions differ from each other only by a location
shift. We have simulated from mixtures having mixing
proportions of .25, .50, and .75. We have simulated from
mixtures with varying degrees of separation between the two
component distributions. Overlap as defined by Woodward
et al. (1982) is a quantification of this separation. It is
defined as the probability of misclassification using
the rule:

Classify an observation x as:
population 1 if $x < x_c$
population 2 if $x > x_c$

where without loss of generality, population 1 is assumed to
be centered to the left of population 2, and where $x_c$ is the
unique point between $\mu_1$ and $\mu_2$ such that

$$p f_1(x_c) = (1-p) f_2(x_c).$$
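For normal components with a common standard deviation, the cut point in the rule above has a closed form, and the overlap follows from the normal distribution function. The Python sketch below illustrates this special case only; the function names are hypothetical, and the closed form for the cut point is an added derivation that assumes equal component variances, as in the normal simulations described here.

```python
import math


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu, sigma):
    return 0.5 * math.erfc(-(x - mu) / (sigma * math.sqrt(2.0)))


def cut_point(p, mu1, mu2, sigma):
    """Solve p f1(x) = (1-p) f2(x) for equal-variance normal components."""
    return 0.5 * (mu1 + mu2) + sigma ** 2 * math.log(p / (1.0 - p)) / (mu2 - mu1)


def overlap(p, mu1, mu2, sigma):
    """Probability of misclassification under the cut-point rule (mu1 < mu2)."""
    xc = cut_point(p, mu1, mu2, sigma)
    miss1 = 1.0 - normal_cdf(xc, mu1, sigma)   # population 1 classified as 2
    miss2 = normal_cdf(xc, mu2, sigma)         # population 2 classified as 1
    return p * miss1 + (1.0 - p) * miss2


if __name__ == "__main__":
    # With p = .5 the cut point is the midpoint, so a separation of about
    # 2.563 standard deviations corresponds to an overlap of .10.
    print(overlap(0.5, 0.0, 2.0 * 1.2815515655446004, 1.0))
```

For components that are not normal, such as the shifted chi-squares, the cut point would have to be found numerically from the component densities.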

We have based our current study on "overlaps" of .03 and
.10. In Figure 4 we display the mixture densities associated
with normal components. For each mixture, the scaled
components $p f_1(x)$ and $(1-p) f_2(x)$ are also shown. Note that
the densities for p=.75 are not displayed here. Since $\sigma_1 = \sigma_2$,
it follows that $f_p(x) = f_{1-p}(\mu_1 + \mu_2 - x)$, where $f_p(x)$ denotes the
mixture density associated with a mixing proportion of p.
Thus the shapes of the densities at p=.75 can be inferred
from those at p=.25. Likewise, parameter estimation for
p=.75 is not included in the results of the simulations for
the mixtures of normals. In Figure 5 we display the mixture
densities associated with the mixtures of $\chi^2(9)$ components.
Note that although we refer to a mixture of $\chi^2(9)$
distributions here, they are actually "shifted" chi-squares,
i.e. the left truncation points are different from zero.

FIGURE 4

Mixture Densities with Normal Components

FIGURE 5

Mixture Densities with $\chi^2(9)$ Components

For each of the simulated samples, three sets of 
parameter estimates were obtained: 

(1) ML estimates based on mixture of normals model (MLEN)

(2) MD estimates based on mixture of normals model (MDEN) 

(3) MD estimates based on mixture of Weibulls model (MDEW) 

Although the MLEN and MDEN provide estimates of all 5 of the 
parameters of the mixture of normals model, and the MDEW 
produces estimates for all 7 parameters in the mixture of 
Weibulls model, only the results for the estimation of p 
will be shown. The mixing proportion is the parameter of 
primary interest, and when dealing with the "wrong-model" 
situations, the remaining parameter estimates often do not 
have a meaningful interpretation. For purposes of aiding in 
the discussions which follow, we will call a component model 
from which we actually simulated, a "simulation component 
model", while a component model which is assumed under a 
particular estimation procedure will be called an 
"estimation component model". Thus, a "wrong-model" 
situation is one in which the simulation component models
are not the same as the estimation component models. 

In the "correct-model" situations, i.e. using the MLEN
or MDEN to estimate the parameters of a simulated mixture of
normal components, the true parameter values are used as
starting values for the iterative estimation procedures. In
all of the other cases, there is not a "true" set of
parameters. For starting values, we have used the "true"
mixing proportion, and then estimated the parameters of each
component separately using a method of moments procedure.
Consider a situation in which the estimation components are
normal. We obtain starting values for each component by
equating the first and second moments of the corresponding
simulation and estimation components and using these to
obtain $\mu_i$ and $\sigma_i^2$ for the normal estimation component. When
the estimation components are Weibull, we have taken the
approach of setting the starting value for $\gamma$ at $\gamma = 3.6$ for
each component. Then the first two moments of the
corresponding simulation and estimation components are
equated to yield starting value estimates for the other two
parameters. We believe that this provides a "neutral" start.
If the final estimates reflect the finding of substantial
skewness for one or both of the component Weibulls, this
will be because of the data and not because of "skewed"
starting values.
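The "neutral" start described above can be written out directly from the mean and variance relations in (2.6). The Python sketch below is an illustration under stated assumptions, not the code used in the study; the function name is hypothetical. The example uses a chi-square distribution with 9 degrees of freedom shifted to a left truncation point of 7.5, whose mean (7.5 + 9) and variance (18) follow from standard chi-square properties.

```python
import math


def weibull_start(mean, variance, gamma=3.6):
    """Starting values (alpha, beta, gamma) matching a given mean and variance.

    The shape is held at gamma (3.6 by default, the near-symmetric case);
    beta and alpha are then obtained by inverting the relations in (2.6).
    """
    g1 = math.gamma(1.0 + 1.0 / gamma)
    g2 = math.gamma(1.0 + 2.0 / gamma)
    beta = math.sqrt(variance / (g2 - g1 ** 2))   # from the variance relation
    alpha = mean - beta * g1                      # from the mean relation
    return alpha, beta, gamma


if __name__ == "__main__":
    # Shifted chi-square(9) with left truncation point 7.5 (population 1)
    print(weibull_start(7.5 + 9.0, 18.0))
```

Note that the starting value for the location parameter obtained this way need not coincide with the left truncation point of the simulation component, since only the first two moments are matched.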

The normal component models were generated with $\mu_1 = 7.5$,
equal variances $\sigma_1^2 = \sigma_2^2$, and $\mu_2$ positioned so that the
desired overlap is obtained. As mentioned previously, both
components in the chi-square mixtures were "shifted"
chi-squares. In our simulations, the left truncation point
for population 1 was always taken to be 7.5, and for
population 2 it was located so that the desired overlap was
obtained. In the MLEN and MDEN procedures, the natural
constraints $\sigma_1^2 > 0$, $\sigma_2^2 > 0$, and $0 \le p \le 1$ were imposed.
Similarly, for the MDEW, the natural constraints $\beta_1 > 0$,
$\gamma_1 > 0$, $\beta_2 > 0$, $\gamma_2 > 0$, and $0 \le p \le 1$ were imposed
along with the constraints $\alpha_1 > 0$ and $\alpha_2 > 0$, which are
reasonable constraints on the left-truncation point which
would be imposed due to physical considerations, etc.

In Table 1 we display the results of the simulations.
For a given simulation model and estimation procedure, we
will obtain an estimate $\hat{p}$ of p, defined by

$$\hat{p} = \frac{1}{n_s} \sum_{i=1}^{n_s} \hat{p}_i$$

where $\hat{p}_i$ is the estimate of p for the ith sample, and $n_s$ is
the number of samples. Then based upon the simulations,
estimates of the bias and MSE are given by:

$$\widehat{\mathrm{bias}} = \frac{1}{n_s} \sum_{i=1}^{n_s} (\hat{p}_i - p) = \hat{p} - p$$

$$\widehat{\mathrm{MSE}} = \frac{1}{n_s} \sum_{i=1}^{n_s} (\hat{p}_i - p)^2 .$$
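The summary quantities defined above are straightforward to compute from the list of per-sample estimates. The following Python sketch is a small illustration with made-up numbers; the function name is hypothetical and the estimates shown are not results from the study.

```python
def bias_and_mse(estimates, true_p):
    """Average estimate, estimated bias, and estimated MSE over n_s samples."""
    n_s = len(estimates)
    p_hat = sum(estimates) / n_s                          # average of the p_i
    bias = p_hat - true_p                                 # equals mean of (p_i - p)
    mse = sum((e - true_p) ** 2 for e in estimates) / n_s
    return p_hat, bias, mse


if __name__ == "__main__":
    # Four hypothetical estimates of p when the true value is .5
    print(bias_and_mse([0.4, 0.5, 0.6, 0.3], 0.5))
```

Because the MSE is measured about the true p rather than about the average estimate, an estimator whose average is close to p can still show a large MSE when the individual estimates are widely scattered.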

Upon viewing the results, it can be seen that the MDEW
was competitive when the component models were actually
normally distributed, and it produced the best overall
results for the chi-square mixtures. Of particular interest
is the chi-square mixture where p=.5 and overlap=.10. This
is the mixture displayed in Figure 5c and also in Figure 1
(except for location shift). When symmetric components are
assumed (as with the MLEN and MDEN), a bias does occur in
the estimation of p as discussed in Section 1. This behavior
has been noted previously by Woodward et al. (1982). We see
from the table that the MDEW performs substantially better
than either of these normal based procedures on the basis of
both bias and MSE. In Figure 6 we display histograms of the
200 estimates of p obtained from the three estimation
procedures for the chi-square mixture shown in Figure 5c. It
can be seen there that the normal based procedures
consistently estimated p to be substantially less than .5
while the estimates based on Weibull components were in
general closer to the true value p=.5.

Table 1 - Simulation Results

Comparing Normal Based with
Weibull Based Estimation Procedures

Sample size = 200
Number of repetitions = 200

Mixture of Normals

                      Overlap = .10               Overlap = .03
  p   Procedure   $\hat{p}$   Bias    MSE      $\hat{p}$   Bias    MSE
 .25    MLEN        .27      .02    .022       .25      .00    .022
        MDEN        .37      .12    .074       .26      .01    .004
        MDEW        .34      .09    .044       .30      .05    .011
 .50    MLEN        .50      .00    .014       .50      .00    .002
        MDEN        ...      ...    .023       .47     -.03    .002
        MDEW        .48     -.02    .019       .51      .01    .004

Mixture of $\chi^2(9)$

                      Overlap = .10               Overlap = .03
  p   Procedure   $\hat{p}$   Bias    MSE      $\hat{p}$   Bias    MSE
 .25    MLEN        .24     -.01    .061       .18     -.07    .006
        MDEN        .41      .16    .098       .17     -.08    .008
        MDEW        .50      .25    .122       .29     -.04    .007
 .50    MLEN        .27     -.23    .064       .45     -.05    .011
        MDEN        .26     -.24    .061       .41     -.09    .010
        MDEW        .42     -.08    .024       .50      .00    .004
 .75    MLEN        .50     -.25    .070       .65     -.10    .013
        MDEN        .48     -.27    .085       .64     -.11    .016
        MDEW        .62     -.13    .032       .71      .04    .005

The one case in which the Weibull based estimates were
not best was when p=.25 with overlap=.10. This mixture is
displayed in Figure 5a where it is obvious that estimation
should be difficult since there is no distinct contribution
due to component 1 in the mixture. Indeed, all procedures
yield poor estimates as measured by the high MSEs. In Figure
7, we display histograms of the $\hat{p}_i$ values obtained from the
three estimation procedures for this set of parameter
configurations. There it can be seen that the Weibull
procedure certainly gave the poorest results, with estimates
being spread nearly uniformly between 0 and 1. However, the
normal based procedures also had difficulty as is reflected
in the histograms. In fact, there appears to be a tendency
for the $\hat{p}_i$ values to be very low (approximately .10).
However, $\hat{p}$ is very close to .25 for the MLEN since several
of the $\hat{p}_i$ values were spread out uniformly between 0 and 1,
which increased the estimate of p to near .25. However, the
large MSE shown in the table for this case reflects this
lack of accuracy.

5. Concluding Remarks

Results in this report and in the report by Woodward
et al. (1982) indicate that the normal based procedures
perform poorly in the presence of a mixture of asymmetric 
distributions. In this paper we have suggested the mixture 
of Weibulls model as an alternative to the mixture of 
normals model in this situation. Results indicate that 
minimum distance estimation of the parameters of a mixture 
of Weibulls is a viable alternative to the normal-based 
techniques currently in use. 

Before this procedure could be recommended and 
implemented, further research is needed. For example, the 
problem of how to obtain starting values for the parameters 
of mixtures of possibly asymmetric components has not been 
resolved. Also, the Weibull based procedures should be
applied to LANDSAT data in order to examine their
performance on the types of asymmetry which will be
encountered in practice. The fact that an additional
parameter has been introduced into the model for each
component has caused the estimation procedures to be slower
than for the normal based procedures. Further investigation
concerning the practical aspects of actually implementing
the procedures is needed.

APPENDIX D

Critique of FCPF Automatic and Semi-Automatic 
Proportion Estimator Results 

W.A. Woodward 

The following discussion will concern the results given
by FCPF in two recent NASA documents (1,2) concerning the
performance of three new automatic and semi-automatic proportion
estimation techniques. There is much overlap in these two
documents although the data and conclusions in (2) represent
revisions and additions to those given in (1). For this reason,
the current report will concentrate mainly on the data and
conclusions in (2). Before proceeding further it should be pointed
out that we simply cannot draw certain inferences without access
to the data itself. However, we will draw whatever conclusions
we feel are warranted from the information provided.

Since we have no first hand experience with implementing 
any of the procedures involved, we will make no remarks concerning 
the implementation aspects of the various methods. Instead we 
will restrict ourselves to questions surrounding the quality of 
the proportion estimators being considered. This quality should 
be viewed from the perspective of how the new estimators compare 
with the current state-of-the-art analyst-intensive estimators as
well as from a more absolute viewpoint concerning simply whether 
or not the new estimators meet acceptable standards. 

Some useful information concerning the performance of the
estimators is given in the table on page 2-22 of (2). Before continuing
with the discussion of the results of this table some words of
caution should be given. First, the historical procedure is not
a single procedure but rather the various stages of the current P1A
procedure during its evolution. Since 1979 is the only year during
which the current state-of-the-art P1A technique was utilized,
the 1979 comparisons would be of most value from this perspective.
Second, it is our understanding that the FCPF techniques were
developed using 1978 and 1979 data. We would expect procedures to
perform best on segments undergoing similar weather patterns, etc.
to the ones on which the techniques were based. Third, the 1976
historical data is based upon ratioed spring wheat rather than
spring small grains. For this reason the 1976 comparisons will
not be of much interest to us. Finally, it is impossible to tell
from the table what the relationship is, for example, in 1977 among
the 38 SSG4 segments, the 25 SSG3C segments, the 37 SSG3B segments
and the 45 historical segments. This should be made clearer as it
has a bearing on interpretation.

It is unfortunate that the mean absolute error (MAE) is not 
available for the historical procedures. When dealing with biased 
estimators, as the historical ones seem to be, the MAE and the mean 
squared error (MSE) are more informative measures than the standard 
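deviation. The sample MSE is given by: 

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}(\hat{p}_i - p_i)^2 = \frac{1}{n}\sum_{i=1}^{n} e_i^2 ,$$

where $\hat{p}_i$ is the estimated proportion and $p_i$ the ground 
truth proportion for segment $i$, and $e_i = \hat{p}_i - p_i$. 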

The MAE and MSE measure the amount of spread in the data around 
the ground truth whereas the standard deviation measures the 
amount of spread around the sample mean which may be quite 
different from the ground truth. From the data on 2-22 the MAE 
values cannot be computed for the historical procedures. However, 
since 

$$\mathrm{MSE} = \frac{n-1}{n} S_e^2 + \bar{e}^2 ,$$

where $S_e$ and $\bar{e}$ are the sample standard deviation and 
sample mean of the errors, we can calculate MSE values for each 
procedure and each year. These MSE values are presented below: 


          Mean Squared Error Values for Data 
                    of Table 2-22 

                 1976      1977      1978      1979 

    SSG4        131.9     136.0     110.0     182.3 
    SSG3C        99.7     147.9     131.1     354.7 
    SSG3B        75.2      92.5     121.3     311.9 
    Historical  100.9      65.7      69.4      46.1 


The results of this table indicate that for 1977-1979 the MSE 
values for the new procedures are two to six times as large as 
those for the historical procedures whereas MSE values for 1976 
are similar. This reinforces the information given on page 2-22 
concerning standard deviation comparisons. In comparing 
estimators, unbiasedness is usually not as important a criterion 
as MSE, i.e., the estimator with smallest MSE is usually favored 
regardless of the bias properties of the estimators. 
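
The way an MSE can be recovered from a reported standard deviation 
and mean error may be sketched as follows. This is an illustration 
only: the error values are hypothetical and are not taken from the 
table on page 2-22 of (2), and the function names are our own. 

```python
import statistics


def mse_direct(errors):
    """Sample MSE: average squared error about the ground truth."""
    return sum(e * e for e in errors) / len(errors)


def mse_from_summary(n, std_dev, mean_error):
    """MSE rebuilt from summary statistics only.

    std_dev is the usual sample standard deviation (divisor n - 1)
    and mean_error is the sample bias.
    """
    return (n - 1) / n * std_dev ** 2 + mean_error ** 2


# Hypothetical errors (estimate minus ground truth, in percent).
errors = [3.0, -1.5, 8.0, 12.5, -4.0, 6.5]
n = len(errors)
s = statistics.stdev(errors)
e_bar = statistics.fmean(errors)

print("variance alone:       ", round(s ** 2, 2))
print("MSE from summary:     ", round(mse_from_summary(n, s, e_bar), 2))
print("MSE computed directly:", round(mse_direct(errors), 2))
```

The two MSE values agree, while the variance alone leaves out the 
squared bias term and so understates the spread about ground truth. 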

Thus the historical techniques seem to be substantially 
better than the new procedures on the basis of MSE for 1977-1979, 
with results in 1979 being quite disparate. On page 2-40 the claim 
is made that when acquisitions can be made in windows 2, 3, and 
4 (which is the optimal sampling situation) then SSG4 performs 
well since the mean error $\bar{e} = .04$. However, along the lines 
of our previous arguments, SSG4 in fact does not perform well even 
in this optimal situation since the associated MSE is 124. From an 
absolute perspective, it simply appears to us that the MSE's for 
the FCPF procedures are too high. 

At this point, brief comment will be made concerning the 
bias properties of the estimators. A 90% confidence interval 
(based upon normality) about the true bias of a procedure, using 
the sample mean error $\bar{e}$ and sample standard deviation 
$S_e$, is given by 

$$\left(\bar{e} - t_{.95}(n-1)\,\frac{S_e}{\sqrt{n}},\ \ \bar{e} + t_{.95}(n-1)\,\frac{S_e}{\sqrt{n}}\right),$$

where $t_{.95}(n-1)$ is the .95 percentage point of the t distribution 
with n-1 degrees of freedom. If this interval contains zero 
then the bias is not significantly different from zero whereas 
if this interval does not include zero, the bias is concluded to 
be significantly different from zero at the $\alpha = .10$ level of 
significance. It is obvious that the effect of large standard 
deviation or of small n is to lengthen this interval. Said another 
way, the result of a larger standard deviation or a smaller sample 
size is to decrease the power of the test, i.e., decrease the 
probability of rejecting unbiasedness when an estimator is really 
biased. 

In the present setting the failure to reject unbiasedness in the FCPF 
estimators is due more to larger standard deviations and 
smaller sample sizes than to smaller bias estimates. In 
fact, if for each year the FCPF estimates had the same standard 
deviation and sample size as that associated with the historical 
data, only SSG3C and SSG3B in 1978 would have yielded estimates 
for which the bias was not significantly different from zero. In 
summary, although the sample biases were generally smaller for 
FCPF estimates than for historical estimates, we certainly agree 
with the statement on 2-3 that there is no significant difference 
in the biases of the two procedures. The authors were not as 
careful in their statement on 2-19. It should be pointed out 
again however that MSE is a measure of the goodness of an 
estimator which is appropriate for comparing estimators whether 
they be biased or unbiased. 
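
The effect of standard deviation and sample size on the test for 
bias can be illustrated with a short sketch. The mean error, 
standard deviations and sample sizes below are hypothetical and 
are not values from (2); the t percentage points are passed in as 
tabled values ($t_{.95}(24) = 1.711$, $t_{.95}(9) = 1.833$), and the 
function names are our own. 

```python
import math


def bias_interval(mean_error, std_dev, n, t_crit):
    """90% interval for the true bias; t_crit is t_.95 with n - 1 d.f."""
    half_width = t_crit * std_dev / math.sqrt(n)
    return mean_error - half_width, mean_error + half_width


def bias_is_significant(mean_error, std_dev, n, t_crit):
    """True when the interval excludes zero (alpha = .10)."""
    low, high = bias_interval(mean_error, std_dev, n, t_crit)
    return not (low <= 0.0 <= high)


# Hypothetical case A: moderate spread, 25 segments.
case_a = dict(mean_error=3.0, std_dev=6.0, n=25, t_crit=1.711)
# Hypothetical case B: same mean error, larger spread, 10 segments.
case_b = dict(mean_error=3.0, std_dev=12.0, n=10, t_crit=1.833)

for name, case in (("A", case_a), ("B", case_b)):
    low, high = bias_interval(**case)
    verdict = "significant" if bias_is_significant(**case) else "not significant"
    print(f"case {name}: ({low:.2f}, {high:.2f}) -> bias {verdict}")
```

Both cases have the same sample bias, yet only case A rejects 
unbiasedness; in case B the wider interval simply reflects less 
power, not a smaller bias. 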

The implications of the large MSE's are evident in other 
data presentations in (2). The small $r^2$ values on pages 2-23 
through 2-25 are the ones associated with large MSE's, with the 
1979 results for SSG4, SSG3C, and SSG3B being extremely noteworthy. 
One would certainly be hesitant to recommend a procedure which 
yielded results as unrelated to ground truth as were the FCPF 
results in 1979. The claim on page 2-19 that the lack of "good" 
correlation in 1979 for SSG4 is explainable seems questionable. 
If the outlier point is deleted, the correspondence between ground 
truth and SSG4 estimates is still poor. Consider a vertical line 
drawn through ground truth proportion .25 on the 1979 SSG4 plot. 
It can be seen that there is very little correlation between ground 
truth and $\hat{p}$ on either side of the line. The correlation which 
does appear is only due to the fact that SSG4 seems to do a fair 
job of separating low ground truth proportions from high ones. 
Another word of warning concerning interpretation of $r^2$ values 
is in order here. The $r^2$ value measures the closeness of fit to 
the line which best fits the data. If this line is not approximately 
the 1-1 line, i.e., the line with slope of 1 and intercept of zero, 
then this fit is of little importance. The line drawn in the 
plots on 2-23 through 2-26 is the 1-1 line. It is clear that 
in some of the plots this line is not the best fitting line 
whose fit to the data is being measured by $r^2$. A more meaningful 
measure of fit would be one which measures departure of 
the data from this 1-1 line. It is easily shown that MSE is the 
average squared vertical distance from the data points to the 
1-1 line. 
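
The distinction between $r^2$ and departure from the 1-1 line can 
be made concrete with a small sketch. The ground truth values and 
the linear relation used to generate the estimates are hypothetical 
and chosen only to make the point; they are not data from (2). 

```python
def r_squared(x, y):
    """Squared Pearson correlation: fit to the best-fitting line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def mse_about_one_to_one(truth, estimate):
    """Average squared vertical distance from the points to the 1-1 line."""
    return sum((e - t) ** 2 for t, e in zip(truth, estimate)) / len(truth)


# Hypothetical ground truth proportions (percent).
truth = [5.0, 10.0, 20.0, 30.0, 40.0]
# Hypothetical estimates lying exactly on a line that is NOT the 1-1 line.
estimate = [0.5 * t + 15.0 for t in truth]

print("r squared:", round(r_squared(truth, estimate), 4))
print("MSE about the 1-1 line:", mse_about_one_to_one(truth, estimate))
```

Here the estimates fit their own best line perfectly, so $r^2$ is 
1, yet they are far from ground truth, which only the MSE reveals. 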

We will conclude with a few additional comments. On page 
2-50, data are given concerning processability rates of the 
procedures. In 1978 data from the second satellite were available, 
which should have produced a higher processability rate. This 
increased processability rate is visible in the 3 FCPF procedures 
but is not visible in the 1978 historical data. Since SSG4 
processability for 1976, 1977, and 1979 was approximately 12-20% 
lower than that for the historical, and since for 1978 the SSG4 
rate was 24% higher than the historical rate, there seems to be 
cause for concern relating to the validity of the 1978 historical 
processability rate. The error characterization analyses were 
interesting and should indeed provide useful information concerning 
possible modifications of the FCPF techniques. It is not clear, of 
course, whether or not modification in the procedure will be able 
to improve performance. 

In conclusion we feel that the results of the comparisons 
between the FCPF automatic and semi-automatic procedures and the 
historical results are not very encouraging. Although results 
and cautions are in general adequately related in (2), it is our 
opinion that the apparent unbiasedness of the FCPF procedures 
resulted in excessive optimism concerning their performance. 

The problem of excessive variance with the FCPF procedures 
was mentioned but seemingly did not cause great concern, possibly 
because of "apparently" larger bias for the historical procedures. 
However, our analyses involving the MSE as the standard for 
comparing estimators indicate that indeed the FCPF procedures do 
not perform at the level of historical analyst-intensive techniques. 


References 


1. Preliminary Technical Results Review of FY81 Experiments, 
   Volume 1, Fiscal Year 1981/82 Spring Small Grains Pilot 
   Experiment, FC-J1-04175, JSC-17433, September 23, 1981. 

2. Semi-Annual Project Management Report, Program Review 
   Presentation to Level 1, Interagency Coordination Committee, 
   FC-J1-04181, JSC-17438, November 1981. 



Review of 

A Crop Area Estimator Based on Changes 
in the Temporal Profile of a 
Vegetative Index 
(Smith and Ramey) 

by Wayne A. Woodward 

The paper by Smith and Ramey contains some interesting 
ideas concerning the use of temporal data in estimating a 
vegetative index. I have several comments concerning the 
paper: 

1. Although I am not extremely familiar with the Cubic 
Color Model of Cates et al., I do want to make some 
comments about it. To me it appears that normalizing the 
readings from channels 1, 2, and 4 based upon the segment 
means in each of these bands can have some undesirable 
effects. For example, if early in the season an 
acquisition is taken when a majority of the segment is 
bare soil, then on the CIR film it would appear mostly 
green in color. However, the normalizing procedure of 
dividing by segment means would assign what is very 
probably green (on CIR film) to the neutral gray 
position. Consequently, a pixel with relative energies 
of (4.99, 4.98, 5.01) would be called red, and placed in 
the vegetative class when in fact it was represented as 
green on the film, and was probably nonvegetated. The 
same phenomenon could of course occur in reverse. In 
addition, the fact that (4.99, 4.98, 5.01) and (0, 0, 10) 
are assigned the same "color" seems like an unfortunate 
loss of information. 

2. Because the procedure does no more than provide 
a vegetative index, it will not, of course, 
be able to provide proportion estimates for individual 
crops. 

3. The shortcomings which the authors list on page 9 are 
quite serious. The fact that the underlying profiles are 
not separated by a constant violates a basic assumption 
in the multiple regression (or analysis of covariance) 
model posed. Also, although only the parameter alpha is 
of interest, it is likely that if estimation is a 
problem, estimates of alpha will suffer along with those 
of the betas. 

4. Finally, the results of the technique as applied to 
segment data require some comment. To me these results 
from the 10 segments seem quite unimpressive. The 
magnitude of the errors is unacceptably high, and the 
authors' statement on page 11 that the technique 
"apparently produced unbiased estimates" is completely 
unfounded. It seems that the authors believe that the 
impressive feature of their results is the "high" 
correlation of .73 between observed and expected percent 
changes. Observing the 10 pairs of values upon which 
this correlation is based reveals that there actually 
does not seem to be a strong correlation between these 
values. In fact, when the results for segment 1658 are 
removed from the data set, then the correlation is only 
.33. (My calculations showed a correlation of .67 
instead of .73 for the data shown.) Examination of the 
data in the table shows that both observed and expected 
percent change for this segment were much larger than 
those from other segments. This data pair thus had an 
inordinate influence on the correlation coefficient 
(sometimes called the "lollipop" effect). If the 
nonparametric Spearman correlation coefficient had been 
used instead of the Pearson correlation, which depends 
upon a bivariate normality assumption, the correlation 
using all 10 data pairs would have been only .43, again 
an unimpressive result. 
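
The influence of a single extreme pair on the Pearson correlation, 
and the relative insensitivity of the Spearman coefficient, can be 
shown with a small sketch. The 10 pairs below are hypothetical and 
are not the Smith and Ramey segment data; the last pair plays the 
role of the one extreme segment. 

```python
def pearson(x, y):
    """Pearson product-moment correlation."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def ranks(values):
    """Ranks starting at 1, with tied values given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical observed and expected percent changes; the last pair
# is far larger than the rest.
observed = [2, 5, 3, 8, 6, 1, 7, 4, 9, 50]
expected = [6, 2, 8, 4, 9, 3, 1, 7, 5, 45]

pearson_all = pearson(observed, expected)
pearson_trimmed = pearson(observed[:-1], expected[:-1])
spearman_all = spearman(observed, expected)

print("Pearson, all 10 pairs:   ", round(pearson_all, 2))
print("Pearson, extreme removed:", round(pearson_trimmed, 2))
print("Spearman, all 10 pairs:  ", round(spearman_all, 2))
```

In this constructed example the one extreme pair alone produces a 
large Pearson correlation; removing it, or working with ranks, 
shows how little association the remaining pairs carry. 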

In short, the results of this paper certainly do not 
convince me that the technique proposed here has any merit. 


